RESTRICTION MAP 8753-BP ECORI FRAGMENT FROM 1 TO 8753
10 20 30 40 50 60

GGGCTGCAGG TCGACTCTAG AATCGATGAA GCCTGCGATG AAGGCGGCGA CGAACAGGAA 
CCCGACGTCC AGCTGAGATC TTAGCTACTT CGGACGCTAC TTCCGCCGCT GCTTGTCCTT 

70 80 90 100 110 120 

GGCGAGCAGG TGGAAGGCGA GATCTTGCAC GGCGGGGACT CGAGAGGAGA GCTGTCAGGC 
CCGCTCGTCC ACCTTCCGCT CTAGAACGTG CCGCCCCTGA GCTCTCCTCT CGACAGTCCG 

130 140 150 160 170 180 

GGGATTTTCC GCCTTGTGTC AGAGCCCGGC GCGATTTGCA AAGCCTTCTG TCGCGGTGTT 
CCCTAAAAGG CGGAACACAG TCTCGGGCCG CGCTAAACGT TTCGGAAGAC AGCGCCACAA 

190 200 210 220 230 240 

GCTGTCCATG CAGGTGTCGA AATTGAAAAA CCGACAAAGA TTCAAAGCCT TGTTCCAGCT 
CGACAGGTAC GTCCACAGCT TTAACTTTTT GGCTGTTTCT AAGTTTCGGA ACAAGGTCGA 

250 260 270 280 290 300 

CGCTGTCTTT CTGGATGGAG GCGCTCTCGC CCGCATGGTG CCGAAGAAGG GCTGTCCTTG 
GCGACAGAAA GACCTACCTC CGCGAGAGCG GGCGTACCAC GGCTTCTTCC CGACAGGAAC 

310 320 330 340 350 360 

CGATACGGTA GGCGGATGAC GATCTTCCTC AAACGCGACA TGGCGATGGC GCAATCCGGT 
GCTATGCCAT CCGCCTACTG CTAGAAGGAG TTTGCGCTGT ACCGCTACCG CGTTAGGCCA 

370 380 390 400 410 420 

TTGACCGGCC TTCCGCGCTC CGGTAAAAAT GAAGGATATG CGACGGCGTC CGCTTTGGCG 
AACTGGCCGG AAGGCGCGAG GCCATTTTTA CTTCCTATAC GCTGCCGCAG GCGAAACCGC 

430 440 450 460 470 480 

GACTGAAAGA GCGTCCGGTG CGGCCGACCC AGTCAGGGGG GCATCAGCCG GTGCTGTCCA 
CTGACTTTCT CGCAGGCCAC GCCGGCTGGG TCAGTCCCCC CGTAGTCGGC CACGACAGGT 

490 500 510 520 530 540 

GATCGGCCGG GACGGATCGT CCCAGCCGGC GCTTCGTTAA GGAGAACAAC GAAGGGAGCC 
CTAGCCGGCC CTGCCTAGCA GGGTCGGCCG CGAAGCAATT CCTCTTGTTG CTTCCCTCGG 

550 560 570 580 590 600 

GGCCGCCGAT GCCATCGGGC CAACACTCTG CACAGACGAC GAAAGCAGGA GCCGGGCTGG 
CCGGCGGCTA CGGTAGCCCG GTTGTGAGAC GTGTCTGCTG CTTTCGTCCT CGGCCCGACC 



FIG. 7A
610 620 630 640 650 660

TGCTCGGGCT CGGCTGCGAG CGTCGCACGC CGGCCGAAGA GGTGATCGCC CTTGCCGAGC 
ACGAGCCCGA GCCGACGCTC GCAGCGTGCG GCCGGCTTCT CCACTAGCGG GAACGGCTCG 

670 680 690 700 710 720 

GTGCGCTTGC CGATGCCGGT GTTGCGCCCG GCGATCTGCG GCTGGTCGCC TCGCTCGATG 
CACGCGAACG GCTACGGCCA CAACGCGGGC CGCTAGACGC CGACCAGCGG AGCGAGCTAC 

730 740 750 760 770 780 

CTCGCGCCGA GGAGCCGGCG ATCCTGGCGG CCGCTCAGCA TTTCGCGGTT CCGGCCGCGT 
GAGCGCGGCT CCTCGGCCGC TAGGACCGCC GGCGAGTCGT AAAGCGCCAA GGCCGGCGCA 

790 800 810 820 830 840 

TCTACGATGC CGCCACGCTC GAAGCCGAAG CTTCCCGGCT CGCCAACCCG TCCGAGATCG 
AGATGCTACG GCGGTGCGAG CTTCGGCTTC GAAGGGCCGA GCGGTTGGGC AGGCTCTAGC 

850 860 870 880 890 900 

TCTTTGCCTA CACGGGTTGT CATGGCGTTG CCGAGGGTGC AGCGCTCGTC GGCGCCGGTC 
AGAAACGGAT GTGCCCAACA GTACCGCAAC GGCTCCCACG TCGCGAGCAG CCGCGGCCAG 

910 920 930 940 950 960 

GCGAAGCCGT GCTGATTGTG CAGAAGATCG TCTCCGCCCA TGCGACGGCC GCACTTGCCG 
CGCTTCGGCA CGACTAACAC GTCTTCTAGC AGAGGCGGGT ACGCTGCCGG CGTGAACGGC 

970 980 990 1000 1010 1020 

GGCCGGCGAC CTTGCGCGCC GAAAAGCGCA TCCAGGCGGC GGAGGCTGTC TGATGCATTC 
CCGGCCGCTG GAACGCGCGG CTTTTCGCGT AGGTCCGCCG CCTCCGACAG ACTACGTAAG 

1030 1040 1050 1060 1070 1080 

TTATGTTGTT GAATTGAATC AATCTTTTGC CCGGGGTTTC TCTCAAGTGG AATCCGGTTC 
AATACAACAA CTTAACTTAG TTAGAAAACG GGCCCCAAAG AGAGTTCACC TTAGGCCAAG 

1090 1100 1110 1120 1130 1140 

TTTAGAGAGC GCGTCAGGCG TGCCGTTGGG TGGCGCCGAA ATACAGGTGG GACAGCACGC 
AAATCTCTCG CGCAGTCCGC ACGGCAACCC ACCGCGGCTT TATGTCCACC CTGTCGTGCG 

1150 1160 1170 1180 1190 1200 

ATGATCGACG ACCTGTTTGC CGGATTGCCG GCGCTCGAAA AAGGTTCGGT CTGGCTGGTC 
TACTAGCTGC TGGAGAAACG GCCTAACGGC CGCGAGCTTT TTCCAAGCCA GACCGACCAG 

FIG. 7B 
1210 1220 1230 1240 1250 1260

GGCGCCGGCC CCGGCGATCC CGGCCTGTTG ACGCTGCATG CGGCCAATGC GCTGCGCCAG 
CCGCGGCCGG GGCCGCTAGG GCCGGACAAC TGCGACGTAC GCCGGTTACG CGACGCGGTC 

1270 1280 1290 1300 1310 1320 

GCGGATGTGA TCGTGCATGA TGCGCTGGTC AACGAGGATT GCCTGAAGCT CGCGCGGCCG 
CGCCTACACT AGCACGTACT ACGCGACCAG TTGCTCCTAA CGGACTTCGA GCGCGCCGGC 

1330 1340 1350 1360 1370 1380 

GGCGCCGTGC TGGAGTTTGC GGGCAAGCGT GGCGGCAAGC CGTCGCCGAA GCAGCGCGAC 
CCGCGGCACG ACCTCAAACG CCCGTTCGCA CCGCCGTTCG GCAGCGGCTT CGTCGCGCTG 

1390 1400 1410 1420 1430 1440 

ATCTCGCTTC GCCTCGTCGA ACTCGCGCGC GCCGGCAACC GGGTGCTGCG CCTCAAAGGC 
TAGAGCGAAG CGGAGCAGCT TGAGCGCGCG CGGCCGTTGG CCCACGACGC GGAGTTTCCG 

1450 1460 1470 1480 1490 1500 

GGCGATCCCT TCGTCTTCGG TCGCGGTGGC GAGGAGGCGC TGACGCTGGT CGAACACCAG 
CCGCTAGGGA AGCAGAAGCC AGCGCCACCG CTCCTCCGCG ACTGCGACCA GCTTGTGGTC 

1510 1520 1530 1540 1550 1560 

GTGCCGTTCC GAATCGTGCC CGGCATCACC GCCGGTATCG GCGGGCTTGC CTATGCCGGC 
CACGGCAAGG CTTAGCACGG GCCGTAGTGG CGGCCATAGC CGCCCGAACG GATACGGCCG 

1570 1580 1590 1600 1610 1620 

ATTCCCGTGA CCCATCGCGA GGTCAACCAC GCGGTCACTT TCCTGACTGG CCATGATTCC 
TAAGGGCACT GGGTAGCGCT CCAGTTGGTG CGCCAGTGAA AGGACTGACC GGTACTAAGG 

1630 1640 1650 1660 1670 1680 

TCCGGCCTGG TGCCGGATCG CATCAACTGG CAGGGCATCG CCAGCGGCTC GCCTGTCATC 
AGGCCGGACC ACGGCCTAGC GTAGTTGACC GTCCCGTAGC GGTCGCCGAG CGGACAGTAG 

1690 1700 1710 1720 1730 1740 

GTCATGTACA TGGCGATGAA ACATATCGGC GCGATCACCG CCAACCTCAT TGCCGGCGGC 
CAGTACATGT ACCGCTACTT TGTATAGCCG CGCTAGTGGC GGTTGGAGTA ACGGCCGCCG 

1750 1760 1770 1780 1790 1800 

CGCTCGCCGG ACGAACCGGT CGCCTTCGTC TGCAACGCCG CGACGCCGCA GCAGGCGGTG 
GCGAGCGGCC TGCTTGGCCA GCGGAAGCAG ACGTTGCGGC GCTGCGGCGT CGTCCGCCAC 
2410 2420 2430 2440 2450 2460

GGGTGGTTGG CGTCATCCTC AACAAGGTCG GCAGCGATCG GCATGAAATG ATGCTGCGCG
CCCACCAACC GCAGTAGGAG TTGTTCCAGC CGTCGCTAGC CGTACTTTAC TACGACGCGC 

2470 2480 2490 2500 2510 2520 

ATGCGCTCGG CAAGGTGCGC ATGCCTGTCT TCGGCGTGCT CCGGCAGGAC AGCGCATTGC 
TACGCGAGCC GTTCCACGCG TACGGACAGA AGCCGCACGA GGCCGTCCTG TCGCGTAACG 

2530 2540 2550 2560 2570 2580 

AACTGCCGGA GCGCCATCTC GGGCTCGTGC AGGCGGGCGA ACACTCAGCG CTTGAGGGCT 
TTGACGGCCT CGCGGTAGAG CCCGAGCACG TCCGCCCGCT TGTGAGTCGC GAACTCCCGA 

2590 2600 2610 2620 2630 2640 

TCATCGAGGC GGCGGCCGCG CGGGTCGAGG CTGCCTGCGA TCTCGACGCC ATCCGCCTGA 
AGTAGCTCCG CCGCCGGCGC GCCCAGCTCC GACGGACGCT AGAGCTGCGG TAGGCGGACT 

2650 2660 2670 2680 2690 2700 

TCGCGACGAT TTTCCCGCAG GTGCCCGCGG CGGCCGATGC CGAGCGTTTG CGGCCGCTCG 
AGCGCTGCTA AAAGGGCGTC CACGGGCGCC GCCGGCTACG GCTCGCAAAC GCCGGCGAGC 

2710 2720 2730 2740 2750 2760 

GTCAGCGCAT CGCGGTCGCG CGCGATATCG CCTTTGCCTT CTGCTACGAG CACCTGCTTT 
CAGTCGCGTA GCGCCAGCGC GCGCTATAGC GGAAACGGAA GACGATGCTC GTGGACGAAA 

2770 2780 2790 2800 2810 2820 

ACGGCTGGCG GCAAGGCGGC GCGGAGATTT CCTTCTTCTC GCCGCTCGCC GACGAGGGGC 
TGCCGACCGC CGTTCCGCCG CGCCTCTAAA GGAAGAAGAG CGGCGAGCGG CTGCTCCCCG 

2830 2840 2850 2860 2870 2880 

CGGATGCGGC AGCCGATGCC GTCTATCTTC CGGGGGGTTA TCCGGAGCTG CATGCGGGGC 
GCCTACGCCG TCGGCTACGG CAGATAGAAG GCCCCCCAAT AGGCCTCGAC GTACGCCCCG 

2890 2900 2910 2920 2930 2940

AGCTGAGCGC CGCCGCCCGA TTCCGTTCCG GCATGCATTC CGCGGCGGAA CGCGGCGCCC 
TCGACTCGCG GCGGCGGGCT AAGGCAAGGC CGTACGTAAG GCGCCGCCTT GCGCCGCGGG 

2950 2960 2970 2980 2990 3000 

GCATCTTCGG CGAGTGCGGC GGCTATATGG TGCTCGGCGA AGGGCTTGTC GCTGCCGATG 
CGTAGAAGCC GCTCACGCCG CCGATATACC ACGAGCCGCT TCCCGAACAG CGACGGCTAC 
3610 3620 3630 3640 3650 3660

TGCCCGCGTG CTTGAAGCGG CCGGCTTTGC TGTCGATCGC GTCGCGGATG CCGACGCGCT 
ACGGGCGCAC GAACTTCGCC GGCCGAAACG ACAGCTAGCG CAGCGCCTAC GGCTGCGCGA 

3670 3680 3690 3700 3710 3720 

CACGGCCGAA CATGGGCTTG TCATCGTCGT CAACCCCAAC AACCCGACCG GCCGCGCCTT 
GTGCCGGCTT GTACCCGAAC AGTAGCAGCA GTTGGGGTTG TTGGGCTGGC CGGCGCGGAA 

3730 3740 3750 3760 3770 3780 

GGCGCCGGCG GAGCTTCTGG CGATCGCCGC AAGGCAGAAG GCGAGCGGCG GACTGCTGCT 
CCGCGGCCGC CTCGAAGACC GCTAGCGGCG TTCCGTCTTC CGCTCGCCGC CTGACGACGA 

3790 3800 3810 3820 3830 3840 

GGTCGATGAG GCCTTCGGCG ATCTTGAGCC GCAACTGAGT GTCGCTGGTC ACGCGTCAGG 
CCAGCTACTC CGGAAGCCGC TAGAACTCGG CGTTGACTCA CAGCGACCAG TGCGCAGTCC 

3850 3860 3870 3880 3890 3900 

GCAAGGCAAC CTCATCGTCT TCCGCTCCTT CGGCAAGTTC TTCGGCCTTG CGGGCCTGCG 
CGTTCCGTTG GAGTAGCAGA AGGCGAGGAA GCCGTTCAAG AAGCCGGAAC GCCCGGACGC 

3910 3920 3930 3940 3950 3960 

CCTCGGCTTC GTCGTTGCGA CCGAGCCAGT GCTTGCATCC TTTGCCGATT GGCTCGGTCC 
GGAGCCGAAG CAGCAACGCT GGCTCGGTCA CGAACGTAGG AAACGGCTAA CCGAGCCAGG 

3970 3980 3990 4000 4010 4020 

CTGGGCTGTC TCCGGCCCGG CGTTGACGAT CTCGAAAGCG CTGATGCAGG GCGATACGAA 
GACCCGACAG AGGCCGGGCC GCAACTGCTA GAGCTTTCGC GACTACGTCC CGCTATGCTT 

4030 4040 4050 4060 4070 4080 

GGCGATCGCG GCGGGCATCC TCGAGCGTCG CGCCGGCCTC GATGCGGCTC TCGATGGGGC 
CCGCTAGCGC CGCCCGTAGG AGCTCGCAGC GCGGCCGGAG CTACGCCGAG AGCTACCCCG 

4090 4100 4110 4120 4130 4140 

AGGGCTCAAC CGTATCGGCG GCACGGGGCT ATTCGTGCTG GTCGAGCATC CCAGGGCAGC 
TCCCGAGTTG GCATAGCCGC CGTGCCCCGA TAAGCACGAC CAGCTCGTAG GGTCCCGTCG 

4150 4160 4170 4180 4190 4200 

TCTGCTGCAG GAGCGGCTCT GCGAGGCCCA TATTCTCACG CGCAAGTTCG ACTATGCCCC 
AGACGACGTC CTCGCCGAGA CGCTCCGGGT ATAAGAGTGC GCGTTCAAGC TGATACGGGG 
4810 4820 4830 4840 4850 4860

GCCTGCCGGG GCTTCTTGCC TACAAGATGC TGAACACCGC CGATTCGATG ATCGGCCACA
CGGACGGCCC CGAAGAACGG ATGTTCTACG ACTTGTGGCG GCTAAGCTAC TAGCCGGTGT 

4870 4880 4890 4900 4910 4920 

AGTCGCCGAA ATATCTGCAC TTCGGCTGGG CCTCGGCCCG ACTCGACGAT CTCGCCAACC 
TCAGCGGCTT TATAGACGTG AAGCCGACCC GGAGCCGGGC TGAGCTGCTA GAGCGGTTGG 

4930 4940 4950 4960 4970 4980 

TGCCGGCAGC GAGGCTCTCG ATCCTTTTGA TCTCAGCCGG TGCGCTGATC CATCGTGGCG 
ACGGCCGTCG CTCCGAGAGC TAGGAAAACT AGAGTCGGCC ACGCGACTAG GTAGCACCGC 



4990 5000 5010 5020 5030 5040

CCAGCGCCGC CAAGGATGCG CTGACCGTGG CCCTTCGCGA CCATGGCCTG CACCGCTCGC
GGTCGCGGCG GTTCCTACGC GACTGGCACC GGGAAGCGCT GGTACCGGAC GTGGCGAGCG



5050 5060 5070 5080 5090 5100 

CGAACTCCGG CTGGCCGGAA GCGGCCATGG CCGGCGCGCT CGATCTGCAG CTTGCCGGTC 
GCTTGAGGCC GACCGGCCTT CGCCGGTACC GGCCGCGCGA GCTAGACGTC GAACGGCCAG 

5110 5120 5130 5140 5150 5160 

CGCGGATCTA TGGCGGCGTC AAGGTCAGCG AACCTATGAT CAACGGTCCG GGCCGAGCGG 
GCGCCTAGAT ACCGCCGCAG TTCCAGTCGC TTGGATACTA GTTGCCAGGC CCGGCTCGCC 

5170 5180 5190 5200 5210 5220

TTGCAACAAG CGAAGACATC GACGCCGGTA TTGCTGTATT TTATGGCGCC TGTACGGTCA 
AACGTTGTTC GCTTCTGTAG CTGCGGCCAT AACGACATAA AATACCGCGG ACATGCCAGT 

5230 5240 5250 5260 5270 5280 

TGGCCGGGTT TGTTCTTGCA ATCGCAATGA TTTGATCGCG GAAGTTGACC TTCGCATTAA 
ACCGGCCCAA ACAAGAACGT TAGCGTTACT AAACTAGCGC CTTCAACTGG AAGCGTAATT 

5290 5300 5310 5320 5330 5340 

GACTCTGCTT TCCATATGTA TTAAGATCGT ATCATATTCG ATCAGTTATT CTCCTGGAAC 
CTGAGACGAA AGGTATACAT AATTCTAGCA TAGTATAAGC TAGTCAATAA GAGGACCTTG 

5350 5360 5370 5380 5390 5400 

GTTTGGTTCC ACCGGTACGT GTTCGTCTTC CCGGAGAGAG AAGCATGCGC AAAAGCTT 
CAAACCAAGG TGGCCATGCA CAAGCAGAAG GGCCTCTCTC TTCGTACGCG TTTTCGAA 

FIG. 7I



Francis BLANCHE et al. 
USAPN: 08/426,630 17 of 189
Atty. Docket 3806.0050-01 



10 20 30 40 50 60

GAATTCGCCA GCGCCTACAT GGCTGACCTC AAGCAGTTCC TCGTGGCCCA GAAGAACGAG 
CTTAAGCGGT CGCGGATGTA CCGACTGGAG TTCGTCAAGG AGCACCGGGT CTTCTTGCTC 

70 80 90 100 110 120 

GGCCGGCAGA TTTTCCCTCG CGGGCCTGAG TATTTTCGCG CGCTCGACCT GACGCCGCTC 
CCGGCCGTCT AAAAGGGAGC GCCCGGACTC ATAAAAGCGC GCGAGCTGGA CTGCGGCGAG 

130 140 150 160 170 180 

GACAAGGTGC GCGTGGTCAT TCTCGGCCAG GATCCCTATC ACGGTGACGG CCAGGCGCAT 
CTGTTCCACG CGCACCAGTA AGAGCCGGTC CTAGGGATAG TGCCACTGCC GGTCCGCGTA 

190 200 210 220 230 240 

GGGCTCTGCT TCAGCGTTCG CCCCGGTGTC CGGACGCCGC CGTCGCTGGT CAACATCTAC 
CCCGAGACGA AGTCGCAAGC GGGGCCACAG GCCTGCGGCG GCAGCGACCA GTTGTAGATG 

250 260 270 280 290 300 

AAGGAACTGA ATACCGATCT CGGTATTCCG CCGGCGCGTC ACGGTTTTCT CGAAAGCTGG 
TTCCTTGACT TATGGCTAGA GCCATAAGGC GGCCGCGCAG TGCCAAAAGA GCTTTCGACC 

310 320 330 340 350 360 

GCAAGGCAGG GCGTGCTGCT TTTGAACAGC GTGCTGACGG TAGAGCGCGG GAACGTGCGT 
CGTTCCGTCC CGCACGACGA AAACTTGTCG CACGACTGCC ATCTCGCGCC CTTGCACGCA 

370 380 390 400 410 420 

CACACCAGGG TCACGGTTGG GAAAAGTTCA CGGATGCGAT CATCCGTGCG GTCAACGAGG 
GTGTGGTCCC AGTGCCAACC CTTTTCAAGT GCCTACGCTA GTAGGCACGC CAGTTGCTCC 

430 440 450 460 470 480 

CCGAGCATCC CGTCGTCTTC ATGCTTTGGG GCTCCTATGC GCAGAAGAAG GCGGCCTTCG 
GGCTCGTAGG GCAGCAGAAG TACGAAACCC CGAGGATACG CGTCTTCTTC CGCCGGAAGC 

490 500 510 520 530 540 

TCGACCGCTC GCGCCATCTT GTCCTGAGGG CACCACATCC GTCGCCGCTC TCAGCCCATT 
AGCTGGCGAG CGCGGTAGAA CAGGACTCCC GTGGTGTAGG CAGCGGCGAG AGTCGGGTAA 

550 560 570 580 590 600 

CCGGCTTTCT CGGCTGCCGG CATTTTTCCC AGGCCAATGC CTTCCTCGAA AGCAAAGGCT 
GGCCGAAAGA GCCGACGGCC GTAAAAAGGG TCCGGTTACG GAAGGAGCTT TCGTTTCCGA 
610 620 630

TCGATCCGAT CGACTGGCGG CTGCCGGAAA 
AGCTAGGCTA GCTGACCGCC GACGGCCTTT 
GCGCGAATGA CGGCTTTGTC GTCGCCCTGA
CGCGCTTACT GCCGAAACAG CAGCGGGACT 

730 740 750 

AGACGCCCGA ACGAAATGGC GGAGGCGGGC 
TCTGCGGGCT TGCTTTACCG CCTCCGCCCG 
CGCACCGAAG GCGTCAGCTA TGACGGCAGC

1030 1040 1050

TAAATGCTTC GCGAAGATAG CTTCCTCAAC

1090 1100 1110 

TGGGGCGACC CGATGCTCTA TGACAGCACC 
ACCCCGCTGG GCTACGAGAT ACTGTCGTGG 

1150 1160 1170 

GGTGAGGTCG CCTTCGCCTA CGACGTCATT 
CCACTCCAGC GGAAGCGGAT GCTGCAGTAA 
1810 1820 1830 1840 1850 1860

CGGCTGCCGA GCGCTTCGGC AATGGCATCA TCGAGATTAC CGCGCGCGGA AACCTGCAGC 
GCCGACGGCT CGCGAAGCCG TTACCGTAGT AGCTCTAATG GCGCGCGCCT TTGGACGTCG 

1870 1880 1890 1900 1910 1920 

TTCGCGGCCT GAGCGCGGCT TCGGTGCCAA GGCTGGCGCA GGCGATCGGC GATGCGGAGA 
AAGCGCCGGA CTCGCGCCGA AGCCACGGTT CCGACCGCGT CCGCTAGCCG CTACGCCTCT 

1930 1940 1950 1960 1970 1980 

TCGCCATTGC CGAGGGGCTC GCGATCGAGG TGCCGCCCCT GGCCGGCATC GACCCGGACG 
AGCGGTAACG GCTCCCCGAG CGCTAGCTCC ACGGCGGGGA CCGGCCGTAG CTGGGCCTGC 

1990 2000 2010 2020 2030 2040 

AGATCGCCGA TCCGCGGCCG ATTGCCACTG AGCTTCGTGA AGCGTTGGAT GTGCGCCAGG 
TCTAGCGGCT AGGCGCCGGC TAACGGTGAC TCGAAGCACT TCGCAACCTA CACGCGGTCC 

2050 2060 2070 2080 2090 2100 

TGCCGTTGAA GCTTGCACCC AAATTATCCG TCGTCATCGA TAGCGGTGGC CGGTTTGGTC 
ACGGCAACTT CGAACGTGGG TTTAATAGGC AGCAGTAGCT ATCGCCACCG GCCAAACCAG 

2110 2120 2130 2140 2150 2160

TCGGCGCTGT CGTCGCCGAC ATTCGCCTTC AGGCGGTTTC GACTGTCGCG GGGGTGGCCT 
AGCCGCGACA GCAGCGGCTG TAAGCGGAAG TCCGCCAAAG CTGACAGCGC CCCCACCGGA 

2170 2180 2190 2200 2210 2220 

GGGTGCTGTC GCTTGGCGGC ACGTCAACGA AGGCATCGAG CGTCGGGACG TTGGCCGGCA 
CCCACGACAG CGAACCGCCG TGCAGTTGCT TCCGTAGCTC GCAGCCCTGC AACCGGCCGT 

2230 2240 2250 2260 2270 2280 

ACGCGGTCGT GCCGGCCCTG ATCACCATTC TCGAGAAACT GGCGAGCCTG GGCACGACGA 
TGCGCCAGCA CGGCCGGGAC TAGTGGTAAG AGCTCTTTGA CCGCTCGGAC CCGTGCTGCT 

2290 2300 2310 2320 2330 2340 

TGCGCGGGCG CGATCTGGAC CCGTCGGAAA TCCGCGCGCT CTGTCGCTGT GAGACATCGT 
ACGCGCCCGC GCTAGACCTG GGCAGCCTTT AGGCGCGCGA GACAGCGACA CTCTGTAGCA 

2350 2360 2370 2380 2390 2400 

CCGAACGCCC GGCCGCTCCG CGTTCGGCCG CAATACCCGG CATTCATGCG CTGGGTAACG 
GGCTTGCGGG CCGGCGAGGC GCAAGCCGGC GTTATGGGCC GTAAGTACGC GACCCATTGC 
2410 2420 2430 2440 2450 2460

CCGACACCGT TCTCGGCCTC GGTCTGGCCT TTGCTCAGGT GGAGGCCGCC GCGCTGGCAT 
GGCTGTGGCA AGAGCCGGAG CCAGACCGGA AACGAGTCCA CCTCCGGCGG CGCGACCGTA 

2470 2480 2490 2500 2510 2520 

CCTACCTGCA TCAGGTCCAG GCGCTTGGCG CCAATGCGAT CCGGCTTGCG CCCGGGCACG 
GGATGGACGT AGTCCAGGTC CGCGAACCGC GGTTACGCTA GGCCGAACGC GGGCCCGTGC 

2530 2540 2550 2560 2570 2580 

CCTTCTTCGT CCTCGGCCTT TGCCCCGAGA CCGCGGCTGT GGCGCAGAGC CTGGCAGCGT 
GGAAGAAGCA GGAGCCGGAA ACGGGGCTCT GGCGCCGACA CCGCGTCTCG GACCGTCGCA 

2590 2600 2610 2620 2630 2640 

CACACGGTTT TCGCATTGCC GAGCAGGATC CGCGCAATGC GATCGCCACC TGCGCCGGCA 
GTGTGCCAAA AGCGTAACGG CTCGTCCTAG GCGCGTTACG CTAGCGGTGG ACGCGGCCGT 

2650 2660 2670 2680 2690 2700 

GCAAGGGTTG CGCCTCGGCG TGGATGGAAA CCAAGGGCAT GGCCGAGCGC CTCGTCGAGA 
CGTTCCCAAC GCGGAGCCGC ACCTACCTTT GGTTCCCGTA CCGGCTCGCG GAGCAGCTCT 

2710 2720 2730 2740 2750 2760 

CGGCGCCGGA ATTGCTCGAC GGGTCGCTCA CCGTGCATCT CTCCGGCTGC GCCAAGGGCT 
GCCGCGGCCT TAACGAGCTG CCCAGCGAGT GGCACGTAGA GAGGCCGACG CGGTTCCCGA 

2770 2780 2790 2800 2810 2820 

GCGCCCGGCC GAAGCCGTCC GAACTGACGC TTGTCGGTGC GCCATCAGGA TACGGGCTTG 
CGCGGGCCGG CTTCGGCAGG CTTGACTGCG AACAGCCACG CGGTAGTCCT ATGCCCGAAC 

2830 2840 2850 2860 2870 2880 

TCGTAAATGG GGCTGCCAAT GGCTTGCCAA GCGCCTACAC CGATGAGAAT GGAATGGGAT 
AGCATTTACC CCGACGGTTA CCGAACGGTT CGCGGATGTG GCTACTCTTA CCTTACCCTA 

2890 2900 2910 2920 2930 2940 

CCGCCCTTGC CCGGCTCGGC CGGCTGGTGC GGCAAAACAA AGACGCTGGC GAATCGGCGC 
GGCGGGAACG GGCCGAGCCG GCCGACCACG CCGTTTTGTT TCTGCGACCG CTTAGCCGCG 

2950 2960 2970 2980 2990 3000 

AGTCCTGTCT TACACGGCTC GGAGCTGCGC GCGTCTCGGC AGCGTTCGAA CAGGGATAGA 
TCAGGACAGA ATGTGCCGAG CCTCGACGCG CGCAGAGCCG TCGCAAGCTT GTCCCTATCT 
3010 3020 3030 3040 3050 3060

CATGCCTGAG TATGATTACA TTCGCGATGG CAACGCCATC TACGAGCGTT CCTTCGCCAT 
GTACGGACTC ATACTAATGT AAGCGCTACC GTTGCGGTAG ATGCTCGCAA GGAAGCGGTA 

3070 3080 3090 3100 3110 3120 

CATCCGCGCC GAGGCCGATC TGTCGCGCTT CTCCGAAGAG GAAGCGGATC TGGCTGTGCG 
GTAGGCGCGG CTCCGGCTAG ACAGCGCGAA GAGGCTTCTC CTTCGCCTAG ACCGACACGC 

3130 3140 3150 3160 3170 3180 

CATGGTGCAC GCCTGCGGTT CCGTCGAGGC GACCAGGCAG TTCGTGTTTT CTCCCGATTT 
GTACCACGTG CGGACGCCAA GGCAGCTCCG CTGGTCCGTC AAGCACAAAA GAGGGCTAAA 

3190 3200 3210 3220 3230 3240 

CGTAAGCTCG GCCCGTGCGG CGCTGAAAGC CGGTGCGCCG ATCCTCTGCG ATGCCGAGAT 
GCATTCGAGC CGGGCACGCC GCGACTTTCG GCCACGCGGC TAGGAGACGC TACGGCTCTA 

3250 3260 3270 3280 3290 3300 

GGTTGCGCAC GGTGTCACCC GCGCCCGTCT GCCGGCCGGC AACGAGGTGA TCTGCACGCT 
CCAACGCGTG CCACAGTGGG CGCGGGCAGA CGGCCGGCCG TTGCTCCACT AGACGTGCGA 

3310 3320 3330 3340 3350 3360 

GCGCGATCCT CGCACGCCCG CACTTGCGGC CGAGATCGGC AACACCCGCT CCGCCGCAGC 
CGCGCTAGGA GCGTGCGGGC GTGAACGCCG GCTCTAGCCG TTGTGGGCGA GGCGGCGTCG 

3370 3380 3390 3400 3410 3420 

CCTGAAGCTC TGGAGCGAGC GGCTGGCCGG TTCGGTGGTC GCGATCGGCA ACGCGCCGAC 
GGACTTCGAG ACCTCGCTCG CCGACCGGCC AAGCCACCAG CGCTAGCCGT TGCGCGGCTG 

3430 3440 3450 3460 3470 3480 

GGCGTTGTTC TTCCTCTTGG AAATGCTGCG CGACGGCGCG CCGAAGCCGG CGGCAATCCT 
CCGCAACAAG AAGGAGAACC TTTACGACGC GCTGCCGCGC GGCTTCGGCC GCCGTTAGGA 

3490 3500 3510 3520 3530 3540 

CGGCATGCCC GTCGGTTTCG TCGGTGCGGC GGAATCGAAG GATGCGCTGG CCGAGAACTC 
GCCGTACGGG CAGCCAAAGC AGCCACGCCG CCTTAGCTTC CTACGCGACC GGCTCTTGAG 

3550 3560 3570 3580 3590 3600 

CTATGGCGTT CCCTTCGCCA TCGTGCGCGG CCGCCTCGGC GGGAGTGCCA TGACGGCGGC 
GATACCGCAA GGGAAGCGGT AGCACGCGCC GGCGGAGCCG CCCTCACGGT ACTGCCGCCG 
3610 3620 3630 3640 3650 3660

AGCGCTTAAC TCGCTCGCGA GGCCGGGCCT GTGAGCGGCG TCGGCGTGGG GCGCCTGATC 
TCGCGAATTG AGCGAGCGCT CCGGCCCGGA CACTCGCCGC AGCCGCACCC CGCGGACTAG 

3670 3680 3690 3700 3710 3720 

GGTGTTGGGA CCGGCCCCGG TGATCCGGAA CTTTTGACGG TCAAGGCGGT GAAGGCGCTC 
CCACAACCCT GGCCGGGGCC ACTAGGCCTT GAAAACTGCC AGTTCCGCCA CTTCCGCGAG 

3730 3740 3750 3760 3770 3780 

GGGCAAGCCG ATGTGCTTGC CTATTTCGCC AAGGCCGGGC GAAGCGGTAA CGGCCGCGCG 
CCCGTTCGGC TACACGAACG GATAAAGCGG TTCCGGCCCG CTTCGCCATT GCCGGCGCGC 

3790 3800 3810 3820 3830 3840 

GTGGTCGAGG GTCTGCTGAA GCCCGATCTT GTCGAGCTGC CGCTATACTA TCCGGTGACG 
CACCAGCTCC CAGACGACTT CGGGCTAGAA CAGCTCGACG GCGATATGAT AGGCCACTGC 

3850 3860 3870 3880 3890 3900 

ACCGAAATCG ACAAGGACGA TGGCGCCTAC AAGACCCAGA TCACCGACTT CTACAATGCG 
TGGCTTTAGC TGTTCCTGCT ACCGCGGATG TTCTGGGTCT AGTGGCTGAA GATGTTACGC 

3910 3920 3930 3940 3950 3960 

TCGGCCGAAG CGGTAGCGGC GCATCTTGCC GCCGGGCGCA CGGTCGCCGT GCTCAGTGAA 
AGCCGGCTTC GCCATCGCCG CGTAGAACGG CGGCCCGCGT GCCAGCGGCA CGAGTCACTT 

3970 3980 3990 4000 4010 4020 

GGCGACCCGC TGTTCTATGG TTCCTACATG CATCTGCATG TGCGGCTCGC CAATCGTTTC 
CCGCTGGGCG ACAAGATACC AAGGATGTAC GTAGACGTAC ACGCCGAGCG GTTAGCAAAG 

4030 4040 4050 4060 4070 4080 

CCGGTCGAGG TGATCCCCGG CATTACCGCC ATGTCCGGCT GTTGGTCGCT TGCCGGCCTG 
GGCCAGCTCC ACTAGGGGCC GTAATGGCGG TACAGGCCGA CAACCAGCGA ACGGCCGGAC 

4090 4100 4110 4120 4130 4140 

CCGCTGGTGC AGGGCGACGA CGTGCTCTCG GTGCTTCCGG GCACCATGGC CGAGGCCGAG 
GGCGACCACG TCCCGCTGCT GCACGAGAGC CACGAAGGCC CGTGGTACCG GCTCCGGCTC 

4150 4160 4170 4180 4190 4200 

CTCGGCCGCA GGCTTGCGGA TACCGAAGCC GCCGTGATCA TGAAGGTCGG GCGCAATTTG 
GAGCCGGCGT CCGAACGCCT ATGGCTTCGG CGGCACTAGT ACTTCCAGCC CGCGTTAAAC 
4210 4220 4230 4240 4250 4260

CCGAAGATCC GTCGGGCGCT CGCTGCCTCC GGCCGTCTCG ACCAGGCCGT CTATGTCGAA 
GGCTTCTAGG CAGCCCGCGA GCGACGGAGG CCGGCAGAGC TGGTCCGGCA GATACAGCTT 

4270 4280 4290 4300 4310 4320 

CGCGGCACGA TGAAGAACGC GGCGATGACG GCTCTTGCGG AAAAGGCCGA CGACGAGGCG 
GCGCCGTGCT ACTTCTTGCG CCGCTACTGC CGAGAACGCC TTTTCCGGCT GCTGCTCCGC 

4330 4340 4350 4360 4370 4380 

CCCTATTTCT CGCTGGTGCT CGTTCCCGGC TGGAAGGACC GACCATGACC GGTACGCTCT 
GGGATAAAGA GCGACCACGA GCAAGGGCCG ACCTTCCTGG CTGGTACTGG CCATGCGAGA 

4390 4400 4410 4420 4430 4440 

ATGTCGTCGG TACCGGACCG GGCAGCGCCA AGCAGATGAC GCCGGAAACG GCGGAAGCCG 
TACAGCAGCC ATGGCCTGGC CCGTCGCGGT TCGTCTACTG CGGCCTTTGC CGCCTTCGGC 

4450 4460 4470 4480 4490 4500 

TTGCGGCCGC TCAGGAGTTT TACGGCTACT TTCCCTATCT CGACCGGCTG AACCTCAGAC 
AACGCCGGCG AGTCCTCAAA ATGCCGATGA AAGGGATAGA GCTGGCCGAC TTGGAGTCTG 

4510 4520 4530 4540 4550 4560 

CGGATCAGAT CCGTGTCGCC TCGGACAACC GCGAGGAGCT CGATCGGGCA CAGGTCGCGC 
GCCTAGTCTA GGCACAGCGG AGCCTGTTGG CGCTCCTCGA GCTAGCCCGT GTCCAGCGCG 

4570 4580 4590 4600 4610 4620 

TGACGCGGGC TGCGGCAGGC GTGAAGGTCT GCATGGTCTC CGGTGGCGAT CCCGGTGTCT 
ACTGCGCCCG ACGCCGTCCG CACTTCCAGA CGTACCAGAG GCCACCGCTA GGGCCACAGA 

4630 4640 4650 4660 4670 4680 

TTGCCATGGC GGCCGCCGTC TGCGAGGCGA TCGACAAGGG ACCGGCGGAA TGGAAGTCGG 
AACGGTACCG CCGGCGGCAG ACGCTCCGCT AGCTGTTCCC TGGCCGCCTT ACCTTCAGCC 

4690 4700 4710 4720 4730 4740 

TTGAACTGGT GATCACGCCC GGCGTGACCG CGATGCTCGC CGTTGCCGCC CGCATCGGCG 
AACTTGACCA CTAGTGCGGG CCGCACTGGC GCTACGAGCG GCAACGGCGG GCGTAGCCGC 

4750 4760 4770 4780 4790 4800 

CGCCGCTCGG TCATGATTTC TGTGCGATCT CGCTTTCCGA CAATCTGAAG CCCTGGGAAG 
GCGGCGAGCC AGTACTAAAG ACACGCTAGA GCGAAAGGCT GTTAGACTTC GGGACCCTTC 
4810 4820 4830 4840 4850 4860

TCATCACCCG GCGTCTCAGG CTGGCGGCGG AAGCGGGCTT CGTCATTGCC CTCTACAATC 
AGTAGTGGGC CGCAGAGTCC GACCGCCGCC TTCGCCCGAA GCAGTAACGG GAGATGTTAG 

4870 4880 4890 4900 4910 4920 

CGATCAGCAA GGCGCGGCCC TGGCAGCTCG GTGAGGCCTT CGAGCTTCTG CGCAGCGTTC 
GCTAGTCGTT CCGCGCCGGG ACCGTCGAGC CACTCCGGAA GCTCGAAGAC GCGTCGCAAG 

4930 4940 4950 4960 4970 4980 

TGCCGGCAAG CGTTCCGGTC ATCTTCGGCC GTGCGGCCGG GCGGCCGGAC GAACGGATCG 
ACGGCCGTTC GCAAGGCCAG TAGAAGCCGG CACGCCGGCC CGCCGGCCTG CTTGCCTAGC 

4990 5000 5010 5020 5030 5040 

CGGTGATGCC GCTCGGCGAG GCCGATGCCA ACCGCGCCGA CATGGCGACC TGCGTCATCA 
GCCACTACGG CGAGCCGCTC CGGCTACGGT TGGCGCGGCT GTACCGCTGG ACGCAGTAGT 

5050 5060 5070 5080 5090 5100 

TCGGCTCGCC GGAGACGCGC ATCGTCGAGC GCGACGGCCA ACCCGATCTC GTCTACACAC 
AGCCGAGCGG CCTCTGCGCG TAGCAGCTCG CGCTGCCGGT TGGGCTAGAG CAGATGTGTG 

5110 5120 5130 5140 5150 5160 

CGCGCTTCTA TGCAGGGGCG AGCCAGTGAG CGATGCGGTT GAGTGCCTCG TCGCAACTGC 
GCGCGAAGAT ACGTCCCCGC TCGGTCACTC GCTACGCCAA CTCACGGAGC AGCGTTGACG 

5170 5180 5190 5200 5210 5220 

CGACCGTCGG CACGTCCGCG GGCTTGCGCC GCTCGACCAT GATCACCTCG ATGCCGAGCC 
GCTGGCAGCC GTGCAGGCGC CCGAACGCGG CGAGCTGGTA CTAGTGGAGC TACGGCTCGG 

5230 5240 5250 5260 5270 5280 

GGCGCGCTGC GGCAATCTTG CCGTAGGTGG CGCTGCCACC GCTGTTCTTG GCGACGATCA 
CCGCGCGACG CCGTTAGAAC GGCATCCACC GCGACGGTGG CGACAAGAAC CGCTGCTAGT 

5290 5300 5310 5320 5330 5340 

CATCGATCTG CCGACTCCTG AGCAACGCGG CTTCGTCGGC TTCCGCAAAG GGACCGGTCG 
GTAGCTAGAC GGCTGAGGAC TCGTTGCGCC GAAGCAGCCG AAGGCGTTTC CCTGGCCAGC 

5350 5360 5370 5380 5390 5400 

CCAGGATCGC CTCCTGGTCG GGCAGATTAA GCGGCGGCGT CACCGGATCG ACGCTGCGGA 
GGTCCTAGCG GAGGACCAGC CCGTCTAATT CGCCGCCGCA GTGGCCTAGC TGCGACGCCT 
5410 5420 5430 5440 5450 5460

TGACGTAGCT GTGCTGCGGC GCGACCTCGA AGTGGAAAGC TTCCTGTCGA CCTATCGCCA
ACTGCATCGA CACGACGCCG CGCTGGAGCT TCACCTTTCG AAGGACAGCT GGATAGCGGT

5470 5480 5490 5500 5510 5520

GGAAGACGCG GCGTCGCCGA TCACCGAGCG CGCTGACGGC CTCGACAACG CTATCGACAG
CCTTCTGCGC CGCAGCGGCT AGTGGCTCGC GCGACTGCCG GAGCTGTTGC GATAGCTGTC

5530 5540 5550 5560 5570 5580

CAGTCCAGCG GTCGCCAGGC AGGGGCACCC ATTCCGGTCG GCGGAGGGCG ATAAGCGCAA
GTCAGGTCGC CAGCGGTCCG TCCCCGTGGG TAAGGCCAGC CGCCTCCCGC TATTCGCGTT

5590 5600 5610 5620 5630 5640

CGCCGGTTCT TTGCGCTGCG TCCGCGGCGT TGTGCGAAAT GCGTGCGGCA AAGGGGTGCG
GCGGCCAAGA AACGCGACGC AGGCGCCGCA ACACGCTTTA CGCACGCCGT TTCCCCACGC

5650 5660 5670 5680 5690 5700

TCGCATCGAC CAGCAGCGCG ATGTTTTCGT CATGCACGAA ATGCGCCAGC CCATCGGCGC
AGCGTAGCTG GTCGTCGCGC TACAAAAGCA GTACGTGCTT TACGCGGTCG GGTAGCCGCG

5710 5720 5730 5740 5750 5760

CGCCAAAGCC GCCGATGCGC GTCTTGACCG GCTGCGGCCG CGGGTCCGCG GTGCGGCCGG
GCGGTTTCGG CGGCTACGCG CAGAACTGGC CGACGCCGGC GCCCAGGCGC CACGCCGGCC

5770 5780 5790 5800 5810 5820

CCAGCGAGAT GGCGGTGTCG TAGCGGACAT CTTCGGCCAA GCGGCGCGCG AGTTCGCGTG
GGTCGCTCTA CCGCCACAGC ATCGCCTGTA GAAGCCGGTT CGCCGCGCGC TCAAGCGCAC

5830 5840 5850 5860 5870 5880

CCTCGGTGGT GCCACCCAGA ATCAGAATAC GAGGTTTTTC CATGGCTGAC GTGTCGAACA
GGAGCCACCA CGGTGGGTCT TAGTCTTATG CTCCAAAAAG GTACCGACTG CACAGCTTGT

5890 5900 5910 5920 5930 5940

GCGAACCCGC CATAGTCTCC CCCTGGCTGA CCGTCATCGG TATCGGTGAG GATGGTGTAG
CGCTTGGGCG GTATCAGAGG GGGACCGACT GGCAGTAGCC ATAGCCACTC CTACCACATC

5950 5960 5970 5980 5990 6000

CGGGTCTCGG CGACGAGGCC AAGCGGCTGA TCGCCGAAGC GCCGGTCGTC TACGGCGGCC
GCCCAGAGCC GCTGCTCCGG TTCGCCGACT AGCGGCTTCG CGGCCAGCAG ATGCCGCCGG
6010 6020 6030 6040 6050 6060

ATCGTCATCT GGAGCTCGCC GCCTCCCTCA TCACCGGCGA AGCGCACAAT TGGCTAAGCC 
TAGCAGTAGA CCTCGAGCGG CGGAGGGAGT AGTGGCCGCT TCGCGTGTTA ACCGATTCGG 

6070 6080 6090 6100 6110 6120 

CCCTCGAACG CTCGGTCGTC GAGATCGTCG CGCGTCGCGG CAGCCCGGTG GTGGTGCTTG 
GGGAGCTTGC GAGCCAGCAG CTCTAGCAGC GCGCAGCGCC GTCGGGCCAC CACCACGAAC 

6130 6140 6150 6160 6170 6180 

CCTCGGGCGA CCCGTTCTTC TTCGGCGTCG GCGTGACGCT GGCGCGCCGC ATCGCCTCGG 
GGAGCCCGCT GGGCAAGAAG AAGCCGCAGC CGCACTGCGA CCGCGCGGCG TAGCGGAGCC 

6190 6200 6210 6220 6230 6240 

CCGAAATACG CACGCTTCCG GCGCCGTCGT CGATCAGTCT TGCCGCCTCG CGCCTCGGCT 
GGCTTTATGC GTGCGAAGGC CGCGGCAGCA GCTAGTCAGA ACGGCGGAGC GCGGAGCCGA 

6250 6260 6270 6280 6290 6300 

GGGCGCTGCA GGATGCGACG CTCGTCTCCG TACATGGGCG GCCGCTGGAT CTGGTGCGAC 
CCCGCGACGT CCTACGCTGC GAGCAGAGGC ATGTACCCGC CGGCGACCTA GACCACGCTG 

6310 6320 6330 6340 6350 6360 

CGCATTTGCA TCCGGGGGCG CGTGTGCTTA CGCTCACGTC GGACGGTGCG GGTCCGCGAG 
GCGTAAACGT AGGCCCCCGC GCACACGAAT GCGAGTGCAG CCTGCCACGC CCAGGCGCTC 

6370 6380 6390 6400 6410 6420 

ACCTTGCCGA GCTTCTGGTT TCAAGCGGCT TCGGTCAGTC GCGACTGACC GTGCTCGAAG 
TGGAACGGCT CGAAGACCAA AGTTCGCCGA AGCCAGTCAG CGCTGACTGG CACGAGCTTC 

6430 6440 6450 6460 6470 6480 

CGCTGGGCGG CGCCGGCGAA CGGGTGACGA CGCAGATCGC CGCGCGCTTC ATGCTCGGCC 
GCGACCCGCC GCGGCCGCTT GCCCACTGCT GCGTCTAGCG GCGCGCGAAG TACGAGCCGG 

6490 6500 6510 6520 6530 6540 

TCGTGCATCC TTTGAACGTC TGCGCCATTG AGGTGGCGGC CGACGAGGGC GCGCGCATCC 
AGCACGTAGG AAACTTGCAG ACGCGGTAAC TCCACCGCCG GCTGCTCCCG CGCGCGTAGG 

6550 6560 6570 6580 6590 6600 

TGCCGCTTGC CGCCGGCCGC GACGATGCGC TGTTCGAACA TGACGGGCAG ATCACCAAGC 
ACGGCGAACG GCGGCCGGCG CTGCTACGCG ACAAGCTTGT ACTGCCCGTC TAGTGGTTCG 
6610 6620 6630 6640 6650 6660

GCGAGGTGCG GGCGCTGACG CTGTCGGCAC TCGCACCGCG CAAGGGCGAA CTGCTATGGG
CGCTCCACGC CCGCGACTGC GACAGCCGTG AGCGTGGCGC GTTCCCGCTT GACGATACCC

6670 6680 6690 6700 6710 6720

ACATCGGCGG CGGCTCCGGC TCGATCGGCA TCGAATGGAT GCTCGCCGAT CCGACCATGC
TGTAGCCGCC GCCGAGGCCG AGCTAGCCGT AGCTTACCTA CGAGCGGCTA GGCTGGTACG

6730 6740 6750 6760 6770 6780

AGGCGATCAC CATCGAGGTT GAGCCGGAGC GGGCAGCGCG CATCGGCCGC AACGCGACGA
TCCGCTAGTG GTAGCTCCAA CTCGGCCTCG CCCGTCGCGC GTAGCCGGCG TTGCGCTGCT

6790 6800 6810 6820 6830 6840

TGTTCGGCGT GCCCGGGCTG ACGGTTGTCG AAGGCGAGGC GCCGGCGGCG CTTGCCGGCC
ACAAGCCGCA CGGGCCCGAC TGCCAACAGC TTCCGCTCCG CGGCCGCCGC GAACGGCCGG

6850 6860 6870 6880 6890 6900

TGCCACAACC GGACGCGATC TTCATCGGCG GCGGCGGCAG CGAAGACGGC GTCATGGAAG
ACGGTGTTGG CCTGCGCTAG AAGTAGCCGC CGCCGCCGTC GCTTCTGCCG CAGTACCTTC

6910 6920 6930 6940 6950 6960

CAGCGATCGA GGCGCTCAAG TCAGGCGGAC GGCTGGTTGC CAACGCGGTG ACGACGGACA
GTCGCTAGCT CCGCGAGTTC AGTCCGCCTG CCGACCAACG GTTGCGCCAC TGCTGCCTGT

6970 6980 6990 7000 7010 7020

TGGAAGCGGT GCTGCTCGAT CATCACGCGC GGCTCGGCGG TTCGCTGATC CGCATCGATA
ACCTTCGCCA CGACGAGCTA GTAGTGCGCG CCGAGCCGCC AAGCGACTAG GCGTAGCTAT

7030 7040 7050 7060 7070 7080

TCGCGCGTGC AGGACCCATC GGCGGCATGA CCGGCTGGAA GCCGGCCATG CCGGTCACCC
AGCGCGCACG TCCTGGGTAG CCGCCGTACT GGCCGACCTT CGGCCGGTAC GGCCAGTGGG

7090 7100 7110 7120 7130 7140

AATGGTCGTG GACGAAGGGC TAAAGCAGTT CCAGCGAAAG TGTGACGCGG TTTTGCGTCC
TTACCAGCAC CTGCTTCCCG ATTTCGTCAA GGTCGCTTTC ACACTGCGCC AAAACGCAGG

7150 7160 7170 7180 7190 7200

GGAACTGCGC AAGAAAAAGA AAGAGTAACC TATGACGGTA CATTTCATCG GCGCCGGCCC
CCTTGACGCG TTCTTTTTCT TTCTCATTGG ATACTGCCAT GTAAAGTAGC CGCGGCCGGG



FIG. 8L 






7210 7220 7230 7240 7250 7260

GGGAGCCGCA GACCTGATCA CGGTGCGTGG TCGCGACCTG ATCGGGCGCT GCCCGGTCTG 
CCCTCGGCGT CTGGACTAGT GCCACGCACC AGCGCTGGAC TAGCCCGCGA CGGGCCAGAC 

7270 7280 7290 7300 7310 7320 

CCTTTACGCC GGCTCGATCG TCTCGCCGGA GCTGCTGCGA TATTGCCCGC CGGGCGCCCG 
GGAAATGCGG CCGAGCTAGC AGAGCGGCCT CGACGACGCT ATAACGGGCG GCCCGCGGGC 

7330 7340 7350 7360 7370 7380 

CATTGTCGAT ACGGCGCCGA TGTCCCTCGA CGAGATCGAG GCGGAGTATG TGAAGGCCGA 
GTAACAGCTA TGCCGCGGCT ACAGGGAGCT GCTCTAGCTC CGCCTCATAC ACTTCCGGCT 

7390 7400 7410 7420 7430 7440 

AGCCGAAGGG CTCGACGTGG CGCGGCTTCA TTCGGGCGAC CTTTCGGTCT GGAGTGCTGT 
TCGGCTTCCC GAGCTGCACC GCGCCGAAGT AAGCCCGCTG GAAAGCCAGA CCTCACGACA 

7450 7460 7470 7480 7490 7500 

GGCCGAACAG ATCCGCCGGC TCGAGAAGCA TGGCATCGCC TATACGATGA CGCCGGGCGT 
CCGGCTTGTC TAGGCGGCCG AGCTCTTCGT ACCGTAGCGG ATATGCTACT GCGGCCCGCA 

7510 7520 7530 7540 7550 7560 

TCCTTCCTTT GCGGCGGCGG CTTCAGCGCT CGGTCGCGAA TTGACCATTC CGGCCGTGGC 
AGGAAGGAAA CGCCGCCGCC GAAGTCGCGA GCCAGCGCTT AACTGGTAAG GCCGGCACCG 

7570 7580 7590 7600 7610 7620 

CCAGAGCCTG GTGCTGACCC GCGTTTCGGG CCGCGCCTCG CCGATGCCGA ACTCAGAAAC 
GGTCTCGGAC CACGACTGGG CGCAAAGCCC GGCGCGGAGC GGCTACGGCT TGAGTCTTTG 

7630 7640 7650 7660 7670 7680 

GCTTTCCGCT TTCGGCGCTA CGGGATCGAC GCTGGCAATC CACCTTGCGA TCCATGCGCT 
CGAAAGGCGA AAGCCGCGAT GCCCTAGCTG CGACCGTTAG GTGGAACGCT AGGTACGCGA 

7690 7700 7710 7720 7730 7740 

TCAGCAGGTG GTCGAGGAAC TGACGCCGCT CTACGGTGCC GACTGCCCGG TCGCCATCGT 
AGTCGTCCAC CAGCTCCTTG ACTGCGGCGA GATGCCACGG CTGACGGGCC AGCGGTAGCA 

7750 7760 7770 7780 7790 7800 

CGTCAAGGCC TCCTGGCCGG ACGAACGCGT GGTGCGCGGC ACGCTCGGTG ACATCGCCGC 
GCAGTTCCGG AGGACCGGCC TGCTTGCGCA CCACGCGCCG TGCGAGCCAC TGTAGCGGCG 

FIG. 8M 
7810 7820 7830 7840 7850 7860

CAAGGTGGCG GAAGAGCCGA TCGAGCGCAC GGCGCTGATC TTCGTCGGTC CGGGGCTCGA 
GTTCCACCGC CTTCTCGGCT AGCTCGCGTG CCGCGACTAG AAGCAGCCAG GCCCCGAGCT 

7870 7880 7890 7900 7910 7920 

AGCCTCCGAT TTCCGTGAAA GCTCGCTCTA CGATCCCGCC TATCAGCGGC GCTTCAGAGG 
TCGGAGGCTA AAGGCACTTT CGAGCGAGAT GCTAGGGCGG ATAGTCGCCG CGAAGTCTCC 

7930 7940 7950 7960 7970 7980 

GCGCGGCGAA TAGGCCGCAC TCCCTCGGGG GTCGGCCTAA GTTTCCCGCT GAGAGGGTTT 
CGCGCCGCTT ATCCGGCGTG AGGGAGCCCC CAGCCGGATT CAAAGGGCGA CTCTCCCAAA 

7990 8000 8010 8020 8030 8040 

TGAAACCTAT TCTGCCGGTT CTTCGCGCGG CGGCCGCTGC TTGAGCGGGA CGCCGCGCTT 
ACTTTGGATA AGACGGCCAA GAAGCGCGCC GCCGGCGACG AACTCGCCCT GCGGCGCGAA 

8050 8060 8070 8080 8090 8100 

TTCCTCGACG CGGTCGCGGT AGAGCGCTGC CTGTCCAAGC AGCATCAGCG TCACCGGCGT 
AAGGAGCTGC GCCAGCGCCA TCTCGCGACG GACAGGTTCG TCGTAGTCGC AGTGGCCGCA 

8110 8120 8130 8140 8150 8160 

GGTGGCGACG ACGAAGACGA TGATCAGGAT TTCGTGGAAT ACCCAGCGGC TCTGCAGCAC 
CCACCGCTGC TGCTTCTGCT ACTAGTCCTA AAGCACCTTA TGGGTCGCCG AGACGTCGTG 

8170 8180 8190 8200 8210 8220 

GGCAAAGCAG ATGATAGAGG CGGCGCAGAT CATCAGTACG CCGCCGCTGG TCGCCAGCGT 
CCGTTTCGTC TACTATCTCC GCCGCGTCTA GTAGTCATGC GGCGGCGACC AGCGGTCGCA 

8230 8240 8250 8260 8270 8280 

CGGTGCGTGC AGGCGCTCGT AGAAGCTGGT GAACCGGAGC AAGCCGACGG AGCCGATCAG 
GCCACGCACG TCCGCGAGCA TCTTCGACCA CTTGGCCTCG TTCGGCTGCC TCGGCTAGTC 

8290 8300 8310 8320 8330 8340 

CGCCACTGCG GCGCCGAGGA CGGTGAGCCC GCAGACGAGA ACGGCTGCCC AGACGGGAAG 
GCGGTGACGC CGCGGCTCCT GCCACTCGGG CGTCTGCTCT TGCCGACGGG TCTGCCCTTC 

8350 8360 8370 8380 8390 8400 

GTCGGTGAGG TGGCTCATTC GATGATCTCC CCGCGCATCA GGAACTTGCC GAAGGCGATC 
CAGCCACTCC ACCGAGTAAG CTACTAGAGG GGCGCGTAGT CCTTGAACGG CTTCCGCTAG 

FIG. 8N 
8410 8420 8430 8440 8450 8460 

GACGAGACGA AGCCGATCAA AGCCACGATC AGGGCGGACT CGAAATAGAG CGAGTTGGCC 
CTGCTCTGCT TCGGCTAGTT TCGGTGCTAG TCCCGCCTGA GCTTTATCTC GCTCAACCGG 

8470 8480 8490 8500 8510 8520 

GTGCGGATGC CGAAGGTCAA GAGCATCAGC ATGGCGTTGA TATAGAGCGT GTCGAGGCCG 
CACGCCTACG GCTTCCAGTT CTCGTAGTCG TACCGCAACT ATATCTCGCA CAGCTCCGGC 

8530 8540 8550 8560 8570 8580 

AGGATACGGT CCTGGGCGCG CGGTCCCCTC ACCATGCGAT AGAAGGCAAA AGCCATCGCC 
TCCTATGCCA GGACCCGCGC GCCAGGGGAG TGGTACGCTA TCTTCCGTTT TCGGTAGCGG 

8590 8600 8610 8620 8630 8640 

AGGCCGAGCA TGATCTGGGC AATCAGGATC GACCAGATGA TTGAAAGTTC CATCATCCGA 
TCCGGCTCGT ACTAGACCCG TTAGTCCTAG CTGGTCTACT AACTTTCAAG GTAGTAGGCT 

8650 8660 8670 8680 8690 8700 

ATATCTCCTT CAGGGCGGTC TCATAGCGCT TGACCGTATC GAGCCAGATG TCCTCGTTCT 
TATAGAGGAA GTCCCGCCAG AGTATCGCGA ACTGGCATAG CTCGGTCTAC AGGAGCAAGA 

8710 8720 8730 8740 8750 8760 

CCATGTCGAG CACGTGGAAG AGCAGGGACT TGCGGCCGCG ATCCGGGGAA TTC 
GGTACAGCTC GTGCACCTTC TCGTCCCTGA ACGCCGGCGC TAGGCCCCTT AAG 
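
The map prints each stretch of the top strand with its complementary strand directly beneath it, and the fragment ends in the EcoRI recognition sequence GAATTC. A minimal Python sketch of both checks follows. The helper names `complement` and `find_sites` are illustrative assumptions and do not come from the original figure.

```python
# Illustrative helpers; the names are assumptions, not part of the source figure.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Base-by-base complement, in the same left-to-right order as the map."""
    return strand.replace(" ", "").upper().translate(COMPLEMENT)


def find_sites(strand, site="GAATTC"):
    """1-based start positions of a recognition sequence (default: EcoRI)."""
    seq = strand.replace(" ", "").upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start + 1)
        start = seq.find(site, start + 1)
    return hits


# Last printed block of the map (positions 8701 to 8753), top and bottom lines.
top = "CCATGTCGAG CACGTGGAAG AGCAGGGACT TGCGGCCGCG ATCCGGGGAA TTC"
bottom = "GGTACAGCTC GTGCACCTTC TCGTCCCTGA ACGCCGGCGC TAGGCCCCTT AAG"

# The bottom line is the complement of the top line.
assert complement(top) == bottom.replace(" ", "")

# Positions are relative to the start of this block, not to the whole fragment.
print(find_sites(top))
```

Adding the block offset (8700) to a reported position gives the coordinate on the 8753-bp fragment.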
SEQUENCE: 5398-BP FRAGMENT FROM 1 TO 1200 LENGTH = 1200 BP 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand, with ATG(549) and TGA(1011) marked.] 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 

OPEN READING FRAME 1 
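
The plots mark ATG codons (A) and stop codons (*) in three frames of the sequence and in three frames of the complementary strand (CS). As a rough illustration of how such marks can be located, the Python sketch below scans all six frames. The function names, the frame numbering, and the choice to report complementary-strand positions in reverse-complement coordinates are assumptions made for illustration; the program that produced the original plots is not identified in the source.

```python
# Hypothetical six-frame scan; not the program used to draw the original plots.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complementary strand read in its own 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def codon_marks(seq, frame):
    """Return (ATG positions, stop positions), 1-based, for frame 0, 1 or 2."""
    starts, stops = [], []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon == "ATG":
            starts.append(i + 1)
        elif codon in STOPS:
            stops.append(i + 1)
    return starts, stops


def six_frame_marks(seq):
    """Marks for three frames of the sequence and three of its reverse complement."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    marks = {}
    for frame in range(3):
        marks[f"FRAME {frame + 1}"] = codon_marks(seq, frame)
        # CS positions are counted along the reverse complement (an assumption).
        marks[f"CS FRAME {frame + 1}"] = codon_marks(rc, frame)
    return marks


if __name__ == "__main__":
    for name, (starts, stops) in six_frame_marks("ATGAAATGA").items():
        print(name, "ATG at", starts, "stops at", stops)
```

An open reading frame then corresponds to a stretch in one frame that begins at an ATG mark and runs to the next stop mark in the same frame.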
SEQUENCE: 5398-BP FRAGMENT FROM 1000 TO 2200 LENGTH = 1201 BP 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand, with ATG(1141) and TGA(1981) marked.] 

*: STOP 
CS: COMPLEMENTARY STRAND 
SEQUENCE: 5398-BP FRAGMENT FROM 1800 TO 3400 LENGTH = 1601 BP 

CLUSTER C 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand; position axis labelled from 1959 to 3239.] 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 

OPEN READING FRAME 3 
SEQUENCE: 5398-BP FRAGMENT FROM 3000 TO 4500 LENGTH = 1501 BP 
CLUSTER C 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and the complementary-strand (CS) frames, with ATG(3281) and TGA(4280) marked; position axis labelled from 3149 to 4349.] 

A = ATG 
SEQUENCE: 5398-BP FRAGMENT FROM 3800 TO 5398 LENGTH = 1599 BP 

[Open reading frame plot; the frame traces and codon marks are not legible in the source.] 
SEQUENCE: 8753-BP FRAGMENT FROM 650 TO 1650 LENGTH = 1001 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and three complementary-strand (CS) frames, with ATG(736) and TGA(1519) marked; position axis labelled from 749 to 1549.] 

OPEN READING FRAME 6 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
SEQUENCE: 8753-BP FRAGMENT FROM 1400 TO 3100 LENGTH = 1701 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence; position axis labelled from 1569 to 2929.] 

OPEN READING FRAME 7 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
SEQUENCE: 8753-BP FRAGMENT FROM 2700 TO 3700 LENGTH = 1001 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand, with ATG(3002) and TGA(3632) marked; position axis labelled from 2799 to 3599.] 

OPEN READING FRAME 8 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
SEQUENCE: 8753-BP FRAGMENT FROM 3500 TO 4500 LENGTH = 1001 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand, with GTG(3631) and a TGA codon marked (the TGA position is only partly legible and ends in 366); position axis labelled from 3599 to 4399.] 

OPEN READING FRAME 9 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
SEQUENCE: 8753-BP FRAGMENT FROM 4150 TO 5150 LENGTH = 1001 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand; position axis labelled from 4249 to 5049.] 

OPEN READING FRAME 10 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
SEQUENCE: 8753-BP FRAGMENT FROM 5000 TO 6000 LENGTH = 1001 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand; position axis labelled from 5099 to 5899.] 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 

OPEN READING FRAME 11 
SEQUENCE: 8753-BP FRAGMENT FROM 5700 TO 7200 LENGTH = 1501 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand, with ATG(5862) marked; position axis labelled from 5849 to 7049.] 

OPEN READING FRAME 12 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
SEQUENCE: 8753-BP FRAGMENT FROM 7000 TO 8000 LENGTH = 1001 

[Open reading frame plot: FRAME 1 to FRAME 3 of the sequence and CS FRAME 1 to CS FRAME 3 of the complementary strand; position axis labelled from 7099 to 7899.] 

OPEN READING FRAME 13 

A = ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
cobA GENE (SEQ ID NO: 3) AND COBA PROTEIN (SEQ ID NO: 4) 
SEQUENCE OF THE 5396-BP ClaI-HindIII-HindIII-HindIII FRAGMENT 

FROM 1141 TO 1980 

MetIleAspAspLeuPheAlaGlyLeuProAlaLeuGluLysGlySerValTrpLeuValGlyAlaGlyProGly 
ATGATCGACGACCTCTTTGCCGGATTGCCGGCGCTCGAAAAAGGTTCGGTCTGGCTGGTCGGCGCCGGCCCCGGC 

1141 1151 1161 1171 1181 1191 1201 
AspProGlyLeuLeuThrLeuHisAlaAlaAsnAlaLeuArgGlnAlaAspValIleValHisAspAlaLeuVal 
GATCCCGGCCTGTTGACGCTGCATGCGGCCAATGCGCTGCGCCAGGCGGATGTOCGTGCATGATGCG(M 

1216 1226 1236 1246 1256 1266 1276 
AsnGluAspCysLeuLysLeuAlaArgProGlyAlaValLeuGluPheAlaGlyLysArgGlyGlyLysProSer 
AACGAGGATTGCCTGAAGCTCGCGCGGCCGGGCGCCGTGCTGGAGTTTGCGGGCAAGCGTGGCGGCAAGCCGTCG 

1291 1301 1311 1321 1331 1341 1351 
ProLysGlnArgAspIleSerLeuArgLeuValGluLeuAlaArgAlaGlyAsnArgValLeuArgLeuLysGly 
CCGAAGCAGCGCGACATCTCGCTTCGCCTCGTCGAACTCGCGCGCGCCGGCAACCGGGT^CTGCGCCTCAAAGGC 

1366 1376 1386 1396 1406 1416 1426 
GlyAspProPheValPheGlyArgGlyGlyGluGluAlaLeuThrLeuValGluHisGlnValProPheArgIle 
GGCGATCCCTTCGTCTTCGGTCGCGGTGGCGAGGAGGCGCTGACGCTGGTCGAACACCAGGTGCCGTTCCGAATC 

1441 1451 1461 1471 1481 1491 1501 
ValProGlyIleThrAlaGlyIleGlyGlyLeuAlaTyrAlaGlyIleProValThrHisArgGluValAsnHis 
GTGCCCGGCATCACCGCCGGTATCGGCGGGCTTGCCTATGCCGGCATTCCCGTGACCCATCGCGAGGTCAACCAC 

1516 1526 1536 1546 1556 1566 1576 
AlaValThrPheLeuThrGlyHisAspSerSerGlyLeuValProAspArgIleAsnTrpGlnGlyIleAlaSer 
GCGGTCACTTTCCTGACTGGCCATGATTCCTCCGGCCTGGTGCCGGATCGCATCAACTGGCAGGGCATCGCCAGC 

1591 1601 1611 1621 1631 1641 1651 
GlySerProValIleValMetTyrMetAlaMetLysHisIleGlyAlaIleThrAlaAsnLeuIleAlaGlyGly 
GGCTCGCCTGTCATCGTCATGTACATGGCGATGAAACATATCGGCGCGATCACCGCCAACCTCATTGCCGGCGGC 

1666 1676 1686 1696 1706 1716 1726 
ArgSerProAspGluProValAlaPheValCysAsnAlaAlaThrProGlnGlnAlaValLeuGluThrThrLeu 
CGCTCGCCGGACGAACCGGTCGCCTTCGTCTGCAACGCCGCGACGCCGCAGCAGGCGGTCCTGGAAACGACGCTT 

1741 1751 1761 1771 1781 1791 1801 
AlaArgAlaGluAlaAspValAlaAlaAlaGlyLeuGluProProAlaIleValValValGlyGluValValArg 
GCGCGTGCAGAGGCCGATGTTGCGGCGGCAGGGCTGGAGCCGCCGGCGATCGTCGTCGTCGGCGAGGT^TGCGG 

1816 1826 1836 1846 1856 1866 1876 
LeuArgAlaAlaLeuAspTrpIleGlyAlaLeuAspGlyArgLysLeuAlaAlaAspProPheAlaAsnArgIle 
CTGCGCGCAGCGCTCGACTGGATCGGCGCGCTGGACGGGCGCAAGCTTGCCGCCGACCCGTTCGCCAATCGCATT 

1891 1901 1911 1921 1931 1941 1951 

LeuArgAsnProAla*** 
CTCAGGAACCCGGCATGA 

1966 1976 1986 1996 2006 2016 2026 
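
Each figure of this kind prints the protein in three-letter code above the DNA, one amino acid per codon, with *** over the stop codon. The Python sketch below shows one way to reproduce that pairing with the standard genetic code. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative assumptions; the software used to prepare the original figure is not identified in the source.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_1[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes, written as in the figures (*** marks a stop).
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna):
    """Translate a coding sequence, codon by codon, into three-letter code."""
    dna = dna.replace(" ", "").upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        residues.append(THREE_LETTER[CODON_TABLE[dna[i:i + 3]]])
    return "".join(residues)


if __name__ == "__main__":
    # First seven codons and last six codons of the cobA listing above.
    print(translate("ATGATCGACGACCTCTTTGCC"))
    print(translate("CTCAGGAACCCGGCATGA"))
```

Comparing the output with the printed protein line is a quick way to spot character-recognition errors in either line of the listing.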
NAME = COBA 

FIRST RESIDUE = 1 
LAST RESIDUE = 280 

              NUMBER    NO. %     WEIGHT   WEIGHT % 

 1  PHE  F         8     2.86    1176.56       4.02 
 2  LEU  L        31    11.07    3505.48      11.99 
 3  ILE  I        16     5.71    1809.28       6.19 
 4  MET  M         4     1.43     524.16       1.79 
 5  VAL  V        27     9.64    2674.89       9.15 
 6  SER  S         8     2.86     696.24       2.38 
 7  PRO  P        19     6.79    1843.95       6.31 
 8  THR  T        10     3.57    1010.50       3.46 
 9  ALA  A        41    14.64    2912.64       9.96 
10  TYR  Y         2     0.71     326.12       1.12 
11  *    *         0     0.00       0.00       0.00 
12  HIS  H         7     2.50     959.42       3.28 
13  GLN  Q         6     2.14     768.36       2.63 
14  ASN  N         9     3.21    1026.36       3.51 
15  LYS  K         8     2.86    1024.72       3.51 
16  ASP  D        15     5.36    1725.45       5.90 
17  GLU  E        13     4.64    1677.52       5.74 
18  CYS  C         2     0.71     206.02       0.70 
19  TRP  W         3     1.07     558.24       1.91 
20  ARG  R        19     6.79    2965.90      10.15 
21  GLY  G        32    11.43    1824.64       6.24 
22                 0     0.00       0.00       0.00 



RESIDUES = 280 

MOLECULAR WEIGHT = 29234 

INDEX OF POLARITY (%) = 34 

ISOELECTRIC POINT = 7.51 

OD 260 (1 mg/ml) = 0.464 OD 280 (1 mg/ml) = 0.652 
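
The NO. % and WEIGHT % columns of the composition table follow from the residue counts, the summed residue weights and the molecular weight. The Python sketch below recomputes them for COBA from the figures printed above. The function name `composition` is illustrative, and the addition of one water molecule (18.02) to the summed residue weights is an assumption; it happens to reproduce the printed molecular weight to within rounding, but the source does not state how that figure was obtained.

```python
# (residue, count, summed residue weight) as printed in the COBA table.
COBA = [
    ("PHE", 8, 1176.56), ("LEU", 31, 3505.48), ("ILE", 16, 1809.28),
    ("MET", 4, 524.16), ("VAL", 27, 2674.89), ("SER", 8, 696.24),
    ("PRO", 19, 1843.95), ("THR", 10, 1010.50), ("ALA", 41, 2912.64),
    ("TYR", 2, 326.12), ("HIS", 7, 959.42), ("GLN", 6, 768.36),
    ("ASN", 9, 1026.36), ("LYS", 8, 1024.72), ("ASP", 15, 1725.45),
    ("GLU", 13, 1677.52), ("CYS", 2, 206.02), ("TRP", 3, 558.24),
    ("ARG", 19, 2965.90), ("GLY", 32, 1824.64),
]
MOLECULAR_WEIGHT = 29234  # as printed for COBA
WATER = 18.02             # assumption: one water added to the residue sum


def composition(rows, molecular_weight):
    """Return {residue: (number %, weight %)}, each rounded to two decimals."""
    total = sum(count for _, count, _ in rows)
    return {
        name: (
            round(100 * count / total, 2),
            round(100 * weight / molecular_weight, 2),
        )
        for name, count, weight in rows
    }


if __name__ == "__main__":
    table = composition(COBA, MOLECULAR_WEIGHT)
    for name, (number_pct, weight_pct) in table.items():
        print(f"{name}  {number_pct:6.2f}  {weight_pct:6.2f}")
    # Residue sum plus one water, for comparison with the printed value.
    print(round(sum(w for _, _, w in COBA) + WATER, 2))
```

The same function can be applied to the other composition tables by substituting their counts, weights and molecular weight.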

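HYDROPHILICITY PROFILE OF THE COBA PROTEIN 

COBA FROM 1 TO 280 

[Plot: hydrophilicity on the vertical axis (-1.50 to 1.50) against residue position on the horizontal axis (ticks every 28 residues, from 28 to 280).] 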
cobB GENE (SEQ ID NO: 5) AND COBB PROTEIN (SEQ ID NO: 6) 
SEQUENCE OF THE 5396-BP ClaI-HindIII-HindIII-HindIII FRAGMENT 

FROM 1980 TO 3281 

MetSerGlyLeuLeuIleAlaAlaProAlaSerGlySerGlyLysThrThrValThrLeuGlyLeuMetArgAla 
ATGAGCGGATTGCTGATTGCCGCACCCGCGTCCGGCTCCGGCAAGACGACGGTGACGCTCGGGCTGATGCGCGCC 

1980 1990 2000 2010 2020 2030 2040 

LeuLysArgArgGlyValAlaIleAlaProGlyLysAlaGlyProAspTyrIleAspProAlaPheHisAlaAla 
CTGAAGAGGCGCGGCGTGGCGATCGCGCCCGGCAAGGCGGGGCCGGACTATATCGATCCCGCTTTCCACGCGGCA 

2055 2065 2075 2085 2095 2105 2115 

AlaThrGlyGluProCysPheAsnTyrAspProTrpAlaMetArgProGluLeuLeuLeuAlaAsnAlaSerHis 
GCGACCGGCGAGCCCTGCTTCAACTACGACCCCTGGGCGATGCGCCCGGAACTGCTGCTTGCCAATGCGTCGCAT 

2130 2140 2150 2160 2170 2180 2190 

ValAlaSerGlyGlyArgThrLeuIleValGluAlaMetMetGlyLeuHisAspGlyAlaAlaAspGlySerGly 
GTGGCCTCCGGCGGGCGCACATTGATCGTCGAGGCGATGATGGGACTGCATGACGGTGCTGCCGACGGCTCGGGA 

2205 2215 2225 2235 2245 2255 2265 

ThrProAlaAspLeuAlaAlaThrLeuAsnLeuAlaValIleLeuValValAspCysAlaArgMetSerGlnSer 
ACGCCAGCGGACCTCGCCGCGACGCTGAACCTTGCGGTCATTCTGGTGGTCGATTGCGCCCGCATGTCCCAGTCG 

2280 2290 2300 2310 2320 2330 2340 

ValAlaAlaLeuValArgGlyTyrAlaAspHisArgAspAspIleArgValValGlyValIleLeuAsnLysVal 
GTTGCCGCCCTCGTGCGCGGCTATGCGGATCATCGCGACGATATCCGGGTGGTTGGCGTCATCCTCAACAAGGTC 

2355 2365 2375 2385 2395 2405 2415 

GlySerAspArgHisGluMetMetLeuArgAspAlaLeuGlyLysValArgMetProValPheGlyValLeuArg 
GGCAGCGATCGGCATGAAATGATGCTGCGCGATGCGCTCGGCAAGGTGCGCATGCCTGTCTTCGGCGTGCTCCGG 

2430 2440 2450 2460 2470 2480 2490 

GlnAspSerAlaLeuGlnLeuProGluArgHisLeuGlyLeuValGlnAlaGlyGluHisSerAlaLeuGluGly 
CAGGACAGCGCATTGCAACTGCCGGAGCGCCATCTCGGGCTCGTGCAGGCGGGCGAACACTCAGCGCTTGAGGGC 

2505 2515 2525 2535 2545 2555 2565 

PheIleGluAlaAlaAlaAlaArgValGluAlaAlaCysAspLeuAspAlaIleArgLeuIleAlaThrIlePhe 
TTCATCGAGGCGGCGGCCGCGCGGGTCGAGGCTGCCTGCGATCTCGACGCCATCCGCCTGATCGCGACGATTTTC 

2580 2590 2600 2610 2620 2630 2640 

ProGlnValProAlaAlaAlaAspAlaGluArgLeuArgProLeuGlyGlnArgIleAlaValAlaArgAspIle 
CCGCAGGTGCCCGCGGCGGCCGATGCCGAGCGTTTGCGGCCGCTCGGTCAGCGCATCGCGGTCGCGCGCGATATC 

2655 2665 2675 2685 2695 2705 2715 

AlaPheAlaPheCysTyrGluHisLeuLeuTyrGlyTrpArgGlnGlyGlyAlaGluIleSerPhePheSerPro 
GCCTTTGCCTTCTGCTACGAGCACCTGCTTTACGGCTGGCGGCAAGGCGGCGCGGAGATTTCCTTCTTCTCGCCG 

2730 2740 2750 2760 2770 2780 2790 

LeuAlaAspGluGlyProAspAlaAlaAlaAspAlaValTyrLeuProGlyGlyTyrProGluLeuHisAlaGly 
CTCGCCGACGAGGGGCCGGATGCGGCAGCCGATGCCGTCTATCTTCCGGGGGGTTATCCGGAGCTGCATGCGGGG 

2805 2815 2825 2835 2845 2855 2865 
GlnLeuSerAlaAlaAlaArgPheArgSerGlyMetHisSerAlaAlaGluArgGlyAlaArgIlePheGlyGlu 
CAGCTGAGCGCCGCCGCCCGATTCCGTTCCGGCATGCATTCCGCGGCGGAACGCGGCGCCCGCATCTTCGGCGAG 
2880 2890 2900 2910 2920 2930 2940 

CysGlyGlyTyrMetValLeuGlyGluGlyLeuValAlaAlaAspGlyThrArgTyrAspMetLeuGlyLeuLeu 
TGCGGCGGCTATATGGTGCTCGGCGAAGGGCTTGTCGCTGCCGATGGCACACGCTACGACATGCTCGGCCTGCTG 

2955 2965 2975 2985 2995 3005 3015 

ProLeuValThrSerPheAlaGluArgArgArgHisLeuGlyTyrArgArgValValProValAspAsnAlaPhe 
CCGCTCGTAACCAGTTTTGCCGAGCGCAGGCGGCACCTCGGCTATCGCCGCGTCGTGCCTGTCGACAACGCCTTC 

3030 3040 3050 3060 3070 3080 3090 

PheAspGlyProMetThrAlaHisGluPheHisTyrAlaThrIleValAlaGluGlyAlaAlaAspArgLeuPhe 
TTCGATGGACCCATGACGGCGCACGAATTCCACTATGCGACCATCGTCGCCGAAGGGGCGGCCGATCGGCTGTTT 

3105 3115 3125 3135 3145 3155 3165 

AlaValSerAspAlaAlaGlyGluAspLeuGlyGlnAlaGlyLeuArgArgGlyProValAlaGlySerPheMet 
GCGGTCAGCGACGCCGCCGGCGAGGATCTCGGCCAGGCGGGCCTCCGGCGCGGCCCTGTCGCCGGTTCCTTCATG 

3180 3190 3200 3210 3220 3230 3240 
HisLeuIleAspValAlaGlyAlaAla*** 
CATCTGATCGACGTCGCAGGTGCTGCATGA 

3255 3265 3275 3285 3295 3305 3315 
NAME = COBB 

FIRST RESIDUE = 1 
LAST RESIDUE = 434 

              NUMBER    NO. %     WEIGHT   WEIGHT % 

 1  PHE  F        17     3.92    2500.19       5.47 
 2  LEU  L        45    10.37    5088.60      11.14 
 3  ILE  I        17     3.92    1922.36       4.21 
 4  MET  M        14     3.23    1834.56       4.02 
 5  VAL  V        31     7.14    3071.17       6.72 
 6  SER  S        19     4.38    1653.57       3.62 
 7  PRO  P        21     4.84    2038.05       4.46 
 8  THR  T        12     2.76    1212.60       2.65 
 9  ALA  A        76    17.51    5399.04      11.82 
10  TYR  Y        11     2.53    1793.66       3.93 
11  *    *         0     0.00       0.00       0.00 
12  HIS  H        14     3.23    1918.84       4.20 
13  GLN  Q         9     2.07    1152.54       2.52 
14  ASN  N         5     1.15     570.20       1.25 
15  LYS  K         5     1.15     640.45       1.40 
16  ASP  D        28     6.45    3220.84       7.05 
17  GLU  E        21     4.84    2709.84       5.93 
18  CYS  C         5     1.15     515.05       1.13 
19  TRP  W         2     0.46     372.16       0.81 
20  ARG  R        34     7.83    5307.40      11.62 
21  GLY  G        48    11.06    2736.96       5.99 
22                 0     0.00       0.00       0.00 



RESIDUES = 434 

MOLECULAR WEIGHT = 45676 

INDEX OF POLARITY (%) = 34 

ISOELECTRIC POINT = 6.47 

OD 260 (1 mg/ml) = 0.351 OD 280 (1 mg/ml) = 0.529 

HYDROPHILICITY PROFILE OF THE COBB PROTEIN 
COBB FROM 1 TO 434 

[Plot: hydrophilicity on the vertical axis (-2.10 to 2.10) against residue position on the horizontal axis (ticks every 43 residues, from 43 to 430).] 
cobC GENE (SEQ ID NO: 7) AND COBC PROTEIN (SEQ ID NO: 8) 
SEQUENCE OF THE 5396-BP ClaI-HindIII-HindIII-HindIII FRAGMENT 

FROM 3281 TO 4279 

MetSerAlaProIleValHisGlyGlyGlyIleThrGluAlaAlaAlaArgTyrGlyGlyArgProGluAspTrp 
ATGAGCGCACCGATCGTTCATGGTGGCGGCATCACCGAGGCCGCAGCGCGCTATGGCGGCCGGCCTGAAGACTGG 
3281 3291 3301 3311 3321 3331 3341 

LeuAspLeuSerThrGlyIleAsnProCysProValAlaLeuProAlaValProGluArgAlaTrpHisArgLeu 
CTCGATCTGTCGACCGGCATCAATCCATGCCCCGTCGCCTTGCCCGCGGTCCCTGAGCGCGCCTGGCACCGGCTG 
3356 3366 3376 3386 3396 3406 3416 

ProAspArgGlnThrValAspAspAlaArgSerAlaAlaAlaAspTyrTyrArgThrAsnGlyValLeuProLeu 
CCGGATCGGCAGACGGTAGATGATGCGCGGAGCGCCGCCGCCGACTACTACCGCACCAACGGCGTGCTGCCTTTG 
3431 3441 3451 3461 3471 3481 3491 

ProValProGlyThrGlnSerValIleGlnLeuLeuProArgLeuAlaProAlaAsnArgHisValAlaIlePhe 
CCGGTGCCGGGCACCCAGTCGGTGATCCAGCTCCTGCCACGTCTTGCTCCGGCCAACAGGCACGTCGCGATTTTC 

3506 3516 3526 3536 3546 3556 3566 

GlyProThrTyrGlyGluTyrAlaArgValLeuGluAlaAlaGlyPheAlaValAspArgValAlaAspAlaAsp 
GGGCCGACCTATGGCGAGTATGCCCGCGTGCTTGAAGCGGCCGGCTTTGCTGTCGATCGCGTCGCGGATGCCGAC 

3581 3591 3601 3611 3621 3631 3641 

AlaLeuThrAlaGluHisGlyLeuValIleValValAsnProAsnAsnProThrGlyArgAlaLeuAlaProAla 
GCGCTCACGGCCGAACATGGGCTTGTCATCGTCGTCAACCCCAACAACCCGACCGGCCGCGCCTTGGCGCCGGCG 
3656 3666 3676 3686 3696 3706 3716 

GluLeuLeuAlaIleAlaAlaArgGlnLysAlaSerGlyGlyLeuLeuLeuValAspGluAlaPheGlyAspLeu 
GAGCTTCTGGCGATCGCCGCAAGGCAGAAGGCGAGCGGCGGACTGCTGCTGGTCGATGAGGCCTTCGGCGATCTT 
3731 3741 3751 3761 3771 3781 3791 

GluProGlnLeuSerValAlaGlyHisAlaSerGlyGlnGlyAsnLeuIleValPheArgSerPheGlyLysPhe 
GAGCCGCAACTGAGTGTCGCTGGTCACGCGTCAGGGCAAGGCAACCTCATCGTCTTCCGCTCCTTCGGCAAGTTC 
3806 3816 3826 3836 3846 3856 3866 

PheGlyLeuAlaGlyLeuArgLeuGlyPheValValAlaThrGluProValLeuAlaSerPheAlaAspTrpLeu 
TTCGGCCTTGCGGGCCTGCGCCTCGGCTTCGTCGTTGCGACCGAGCCAGTGCTTGCATCCTTTGCCGATTGGCTC 
3881 3891 3901 3911 3921 3931 3941 

GlyProTrpAlaValSerGlyProAlaLeuThrIleSerLysAlaLeuMetGlnGlyAspThrLysAlaIleAla 
GGTCCCTGGGCTGTCTCCGGCCCGGCGTTGACGATCTCGAAAGCGCTGATGCAGGGCGATACGAAGGCGATCGCG 
3956 3966 3976 3986 3996 4006 4016 

AlaGlyIleLeuGluArgArgAlaGlyLeuAspAlaAlaLeuAspGlyAlaGlyLeuAsnArgIleGlyGlyThr 
GCGGGCATCCTCGAGCGTCGCGCCGGCCTCGATGCGGCTCTCGATGGGGCAGGGCTCAACCGTATCGGCGGCACG 
4031 4041 4051 4061 4071 4081 4091 

GlyLeuPheValLeuValGluHisProArgAlaAlaLeuLeuGlnGluArgLeuCysGluAlaHisIleLeuThr 
GGGCTATTCGTGCTGGTCGAGCATCCCAGGGCAGCTCTGCTGCAGGAGCGGCTCTGCGAGGCCCATATTCTCACG 

4106 4116 4126 4136 4146 4156 4166 

ArgLysPheAspTyrAlaProThrTrpLeuArgValGlyLeuAlaProAspAlaAlaGlyAspArgArgLeuAla 
CGCAAGTTCGACTATGCCCCGACCTGGCTCAGGGTCGGTCTTGCGCCTGACGCGGCTGGTGACCGACGGCTGGCG 

4181 4191 4201 4211 4221 4231 4241 
AspAlaLeuAlaArgMetGluLeu*** 
GACGCGCTTGCCCGCATGGAGCTCTGA 

4256 4266 4276 4286 4296 4306 4316 
NAME = COBC 

FIRST RESIDUE = 1 
LAST RESIDUE = 333 

              NUMBER    NO. %     WEIGHT   WEIGHT % 

 1  PHE  F        11     3.30    1617.77       4.62 
 2  LEU  L        43    12.91    4862.44      13.90 
 3  ILE  I        13     3.90    1470.04       4.20 
 4  MET  M         3     0.90     393.12       1.12 
 5  VAL  V        24     7.21    2377.68       6.79 
 6  SER                           957.33       2.74 
 7                               2232.15       6.38 

[The remaining cells and rows 8 to 22 of the table are not legible in the source.] 

RESIDUES = 333 

MOLECULAR WEIGHT = 34992 

INDEX OF POLARITY (%) = 34 

ISOELECTRIC POINT = 6.72 

OD 260 (1 mg/ml) = 0.670 OD 280 = [value not legible] 

HYDROPHILICITY PROFILE OF THE COBC PROTEIN 
COBC FROM 1 TO 333 

[Plot: hydrophilicity on the vertical axis (-1.80 to 1.80) against residue position on the horizontal axis.] 
cobD GENE (SEQ ID NO: 9) AND COBD PROTEIN (SEQ ID NO: 10) 
SEQUENCE OF THE 5396-BP ClaI-HindIII-HindIII-HindIII FRAGMENT 

FROM 4284 TO 5252 

MetSerGluThrIleLeuLeuIleLeuAlaLeuAlaLeuValIleAspArgValValGlyAspProAspTrpLeu 
GTGTCGGAGACGATCCTGCTCATTCTCGCGCTGGCGCTGGTGATCGACCGCGTTGTCGGCGATCCGGACTGGCTC 
4284 4294 4304 4314 4324 4334 4344 

TrpAlaArgValProHisProValValPhePheGlyLysAlaIleGlyPhePheAspAlaArgLeuAsnArgGlu 
TGGGCGCGCGTGCCGCATCCGGTCGTGTTTTTCGGCAAGGCCATCGGCTTTTTCGACGCGCGGCTGAACCGGGAG 

4359 4369 4379 4389 4399 4409 4419 

AspLeuGluAspSerAlaArgLysPheArgGlyValValAlaIleLeuLeuLeuLeuGlyIleSerAlaTrpPhe 
GACCTCGAGGATAGCGCGCGCAAATTTCGTGGCGTCGTCGCGATCCTTTTGTTGCTTGGCATCAGCGCCTGGTTC 

4434 4444 4454 4464 4474 4484 4494 

GlyHisLeuLeuHisArgLeuPheAlaValLeuGlyProLeuGlyPheLeuLeuGluAlaValLeuValAlaVal 
GGCCATCTGCTGCATCGCCTGTTCGCCGTCCTCGGACCGCTCGGCTTTCTGCTCGAGGCGGTTCTGGTCGCGGTC 

4509 4519 4529 4539 4549 4559 4569 

PheLeuAlaGlnLysSerLeuAlaAspHisValArgArgValAlaGlyGlyLeuArgGlnGlyGlyLeuGluGly 
TTCCTGGCACAGAAGAGCCTCGCCGATCACGTGCGTCGCGTGGCCGGGGGCTTGCGACAGGGCGGGCTGGAAGGC 

4584 4594 4604 4614 4624 4634 4644 

GlyArgAlaAlaValSerMetIleValGlyArgAspProLysThrLeuAspGluProAlaValCysArgAlaAla 
GGGCGTGCCGCCGTGTCGATGATCGTTGGTCGCGATCCAAAGACGCTCGACGAGCCGGCGGTCTGCCGTGCCGCG 

4659 4669 4679 4689 4699 4709 4719 

IleGluSerLeuAlaGluAsnPheSerAspGlyValValAlaProAlaPheTrpTyrAlaValAlaGlyLeuPro 
ATCGAAAGCCTTGCCGAGAATTTCTCCGACGGCGTCGTGGCGCCGGCCTTCTGGTACGCGGTTGCCGGCCTGCCG 

4734 4744 4754 4764 4774 4784 4794 

GlyLeuLeuAlaTyrLysMetLeuAsnThrAlaAspSerMetIleGlyHisLysSerProLysTyrLeuHisPhe 
GGGCTTCTTGCCTACAAGATGCTGAACACCGCCGATTCGATGATCGGCCACAAGTCGCCGAAATATCTGCACTTC 

4809 4819 4829 4839 4849 4859 4869 

GlyTrpAlaSerAlaArgLeuAspAspLeuAlaAsnLeuProAlaAlaArgLeuSerIleLeuLeuIleSerAla 
GGCTGGGCCTCGGCCCGACTCGACGATCTCGCCAACCTGCCGGCAGCGAGGCTCTCGATCCTTTTGATCTCAGCC 

4884 4894 4904 4914 4924 4934 4944 

GlyAlaLeuIleHisArgGlyAlaSerAlaAlaLysAspAlaLeuThrValAlaLeuArgAspHisGlyLeuHis 
GGTGCGCTGATCCATCGTGGCGCCAGCGCCGCCAAGGATGCGCTGACCGTGGCCCTTCGCGACCATGGCCTGCAC 

4959 4969 4979 4989 4999 5009 5019 

ArgSerProAsnSerGlyTrpProGluAlaAlaMetAlaGlyAlaLeuAspLeuGlnLeuAlaGlyProArgIle 
CGCTCGCCGAACTCCGGCTGGCCGGAAGCGGCCATGGCCGGCGCGCTCGATCTGCAGCTTGCCGGTCCGCGGATC 

5034 5044 5054 5064 5074 5084 5094 

TyrGlyGlyValLysValSerGluProMetIleAsnGlyProGlyArgAlaValAlaThrSerGluAspIleAsp 
TATGGCGGCGTCAAGGTCAGCGAACCTATGATCAACGGTCCGGGCCGAGCGGTTGCAACAAGCGAAGACATCGAC 

5109 5119 5129 5139 5149 5159 5169 

AlaGlyIleAlaValPheTyrGlyAlaCysThrValMetAlaGlyPheValLeuAlaIleAlaMetIle*** 
GCCGGTATTGCTGTATTTTATGGCGCCTGTACGGTCATGGCCGGGTTTGTTCTTGCAATCGCAATGATTTGA 

5184 5194 5204 5214 5224 5234 5244 
cobE GENE (SEQ ID NO: 11) AND COBE PROTEIN (SEQ ID NO: 12) 
SEQUENCE OF THE 5396-BP ClaI-HindIII-HindIII-HindIII FRAGMENT 

FROM 549 TO 1010 

MetProSerGlyGlnHisSerAlaGlnThrThrLysAlaGlyAlaGlyLeuValLeuGlyLeuGlyCysGluArg 
ATGCCATCGGGCCAACACTCTGCACAGACGACGAAAGCAGGAGCCGGGCTGGTGCTCGGGCTCGGCTGCGAGCGT 

549 559 569 579 589 599 609 

ArgThrProAlaGluGluValIleAlaLeuAlaGluArgAlaLeuAlaAspAlaGlyValAlaProGlyAspLeu 
CGCACGCCGGCCGAAGAGGTGATCGCCCTTGCCGAGCGTGCGCTTGCCGATGCCGGTGTTGCGCCCGGCGATCTG 

624 634 644 654 664 674 684 

ArgLeuValAlaSerLeuAspAlaArgAlaGluGluProAlaIleLeuAlaAlaAlaGlnHisPheAlaValPro 
CGGCTGGTCGCCTCGCTCGATGCTCGCGCCGAGGAGCCGGCGATCCTGGCGGCCGCTCAGCATTTCGCGGTTCCG 

699 709 719 729 739 749 759 
AlaAlaPheTyrAspAlaAlaThrLeuGluAlaGluAlaSerArgLeuAlaAsnProSerGluIleValPheAla 
GCCGCGTTCTACGATGCCGCCACGCTCGAAGCCGAAGCTTCCCGGCTCGCCAACCCGTCCGAGATCGTCTTTGCC 

774 784 794 804 814 824 834 

TyrThrGlyCysHisGlyValAlaGluGlyAlaAlaLeuValGlyAlaGlyArgGluAlaValLeuIleValGln 
TACACGGGTTGTCATGGCGTTGCCGAGGGTGCAGCGCTCGTCGGCGCCGGTCGCGAAGCCGTGCTGATTGTGCAG 

849 859 869 879 889 899 909 
LysIleValSerAlaHisAlaThrAlaAlaLeuAlaGlyProAlaThrLeuArgAlaGluLysArgIleGlnAla 
AAGATCGTCTCCGCCCATGCGACGGCCGCACTTGCCGGGCCGGCGACCTTGCGCGCCGAAAAGCGCATCCAGGCG 

924 934 944 954 964 974 984 
AlaGluAlaVal*** 
GCGGAGGCTGTCTGA 

999 1009 1019 1029 1039 1049 1059 
cobF GENE (SEQ ID NO: 13) AND COBF PROTEIN (SEQ ID NO: 14) 
SEQUENCE OF THE 8753-BP FRAGMENT FROM 736 TO 1521 

MetAlaGluAlaGlyMetArgLysIleLeuIleIleGlyIleGlySerGlyAsnProGluHisMetThrValGln 
ATGGCGGAGGCGGGCATGCGCAAAATTCTGATCATCGGCATCGGTTCGGGCAATCCCGAACACATGACCGTGCAG 
736 746 756 766 776 786 796 
AlaIleAsnAlaLeuAsnCysAlaAspValLeuPheIleProThrLysGlyAlaLysLysThrGluLeuAlaGlu 
GCGATCAACGCGCTGAACTGCGCCGACGTGCTCTTTATCCCGACCAAGGGAGCGAAGAAGACCGAGCTTGCCGAA 
811 821 831 841 851 861 871 
ValArgArgAspIleCysAlaArgTyrValThrArgLysAspSerArgThrValGluPheAlaValProValArg 
GTGCGCCGCGACATCTGCGCCCGCTACGTCACGCGCAAGGACAGCCGCACCGTCGAGTTCGCGGTGCCCGTGCGG 
886 896 906 916 926 936 946 
ArgThrGluGlyValSerTyrAspGlySerValAspAspTrpHisAlaGlnIleAlaGlyIleTyrGluAlaLeu 
CGCACCGAAGGCGTCAGCTATGACGGCAGCGTCGATGACTGGCACGCCCAGATCGCTGGGATTTACGAAGCGCTT 
961 971 981 991 1001 1011 1021 
LeuSerLysGluLeuGlyGluGluGlyThrGlyAlaPheLeuValTrpGlyAspProMetLeuTyrAspSerThr 
CTATCGAAGGAGTTGGGCGAAGAGGGAACTGGCGCGTTTCTCGTCTGGGGCGACCCGATGCTCTATGACAGCACC 

1036 1046 1056 1066 1076 1086 1096 

IleArgIleValGluArgValLysAlaArgGlyGluValAlaPheAlaTyrAspValIleProGlyIleThrSer 
ATTCGCATCGTCGAGCGGGTCAAGGCACGCGGTGAGGTCGCCTTCGCCTACGACGTCATTCCCGGGATCACCAGT 

1111 1121 1131 1141 1151 1161 1171 

LeuGlnAlaLeuCysAlaSerHisArgIleProLeuAsnLeuValGlyLysProValGluIleThrThrGlyArg 
CTGCAGGCGCTTTGCGCCAGCCACCGCATTCCGCTGAACCTCGTCGGCAAGCCGGTGGAGATCACCACGGGGCGT 

1186 1196 1206 1216 1226 1236 1246 

ArgLeuHisGluSerPheProGluLysSerGlnThrSerValValMetLeuAspGlyGluGlnAlaPheGlnArg 
CGGCTGCACGAAAGCTTTCCCGAGAAGAGCCAGACCTCGGTCGTCATGCTCGATGGCGAACAGGCGTTTCAGCGG 

1261 1271 1281 1291 1301 1311 1321 

ValGluAspProGluAlaGluIleTyrTrpGlyAlaTyrLeuGlyThrArgAspGluIleValIleSerGlyArg 
GTCGAGGACCCGGAGGCGGAGATCTATTGGGGCGCCTATCTCGGCACGCGGGATGAGATCGTCATTTCCGGCCGC 

1336 1346 1356 1366 1376 1386 1396 

ValAlaGluValLysAspArgIleLeuGluThrArgAlaAlaAlaArgAlaLysMetGlyTrpIleMetAspIle 
GTGGCTGAGGTGAAGGACCGGATCCTTGAAACGCGGGCGGCGGCGCGCGCGAAGATGGGATGGATCATGGACATC 

1411 1421 1431 1441 1451 1461 1471 
TyrLeuLeuArgLysGlyAlaAspPheAspGlu*** 
TATCTCCTGCGCAAGGGCGCCGACTTCGACGAGTGA 

1486 1496 1506 1516 
COBF PROTEIN 



FIRST RESIDUE = 1 
cobG GENE (SEQ ID NO: 15) AND COBG PROTEIN (SEQ ID NO: 16) 
SEQUENCE OF THE 8753-BP FRAGMENT FROM 1620 TO 2999 

MetThrAspLeuMetThrSerCysAlaLeuProLeuThrGlyAspAlaGlyThrValAlaSerMetArgArgGly 
ATGACGGATTTGATGACCAGCTGCGCCCTTCCATTGACCGGAGATGCCGGCACCGTCGCTTCGATGCGCCGCGGC 

1620 1630 1640 1650 1660 1670 1680 

AlaCysProSerLeuAlaGluProMetGlnThrGlyAspGlyLeuLeuValArgValArgProThrAspAspSer 
GCCTGCCCGTCCTTGGCAGAGCCGATGCAGACCGGCGACGGCCTGCTCGTGAGGGTGAGGCCAACGGATGACAGC 

1695 1705 1715 1725 1735 1745 1755 

LeuThrLeuProLysValIleAlaLeuAlaThrAlaAlaGluArgPheGlyAsnGlyIleIleGluIleThrAla 
CTGACGCTGCCGAAGGTCATTGCCCTTGCCACGGCTGCCGAGCGCTTCGGCAATGGCATCATCGAGATTACCGCG 

1770 1780 1790 1800 1810 1820 1830 

ArgGlyAsnLeuGlnLeuArgGlyLeuSerAlaAlaSerValProArgLeuAlaGlnAlaIleGlyAspAlaGlu 
CGCGGAAACCTGCAGCTTCGCGGCCTGAGCGCGGCTTCGGTGCCAAGGCTGGCGCAGGCGATCGGCGATGCGGAG 

1845 1855 1865 1875 1885 1895 1905 

IleAlaIleAlaGluGlyLeuAlaIleGluValProProLeuAlaGlyIleAspProAspGluIleAlaAspPro 
ATCGCCATTGCCGAGGGGCTCGCGATCGAGGTGCCGCCCCTGGCCGGCATCGACCCGGACGAGATCGCCGATCCG 

1920 1930 1940 1950 1960 1970 1980 

ArgProIleAlaThrGluLeuArgGluAlaLeuAspValArgGlnValProLeuLysLeuAlaProLysLeuSer 
CGGCCGATTGCCACTGAGCTTCGTGAAGCGTTGGATGTGCGCCAGGTGCCGTTGAAGCTTGCACCCAAATTATCC 

1995 2005 2015 2025 2035 2045 2055 

ValValIleAspSerGlyGlyArgPheGlyLeuGlyAlaValValAlaAspIleArgLeuGlnAlaValSerThr 
GTCGTCATCGATAGCGGTGGCCGGTTTGGTCTCGGCGCTGTCGTCGCCGACATTCGCCTTCAGGCGGTTTCGACT 

2070 2080 2090 2100 2110 2120 2130 

ValAlaGlyValAlaTrpValLeuSerLeuGlyGlyThrSerThrLysAlaSerSerValGlyThrLeuAlaGly 
GTCGCGGGGGTGGCCTGGGTGCTGTCGCTTGGCGGCACGTCAACGAAGGCATCGAGCGTCGGGACGTTGGCCGGC 

2145 2155 2165 2175 2185 2195 2205 

AsnAlaValValProAlaLeuIleThrIleLeuGluLysLeuAlaSerLeuGlyThrThrMetArgGlyArgAsp 
AACGCGGTCGTGCCGGCCCTGATCACCATTCTCGAGAAACTGGCGAGCCTGGGCACGACGATGCGCGGGCGCGAT 

2220 2230 2240 2250 2260 2270 2280 

LeuAspProSerGluIleArgAlaLeuCysArgCysGluThrSerSerGluArgProAlaAlaProArgSerAla 
CTGGACCCGTCGGAAATCCGCGCGCTCTGTCGCTGTGAGACATCGTCCGAACGCCCGGCCGCTCCGCGTTCGGCC 

2295 2305 2315 2325 2335 2345 2355 

AlaIleProGlyIleHisAlaLeuGlyAsnAlaAspThrValLeuGlyLeuGlyLeuAlaPheAlaGlnValGlu 
GCAATACCCGGCATTCATGCGCTGGGTAACGCCGACACCGTTCTCGGCCTCGGTCTGGCCTTTGCTCAGGTGGAG 

2370 2380 2390 2400 2410 2420 2430 

AlaAlaAlaLeuAlaSerTyrLeuHisGlnValGlnAlaLeuGlyAlaAsnAlaIleArgLeuAlaProGlyHis 
GCCGCCGCGCTGGCATCCTACCTGCATCAGGTCCAGGCGCTTGGCGCCAATGCGATCCGGCTTGCGCCCGGGCAC 

2445 2455 2465 2475 2485 2495 2505 
AlaPhePheValLeuGlyLeuCysProGluThrAlaAlaValAlaGlnSerLeuAlaAlaSerHisGlyPheArg 
GCCTTCTTCGTCCTCGGCCTTTGCCCCGAGACCGCGGCTGTGGCGCAGAGCCTGGCAGCGTCACACGGTTTTCGC 

2520 2530 2540 2550 2560 2570 2580 

IleAlaGluGlnAspProArgAsnAlaIleAlaThrCysAlaGlySerLysGlyCysAlaSerAlaTrpMetGlu 
ATTGCCGAGCAGGATCCGCGCAATGCGATCGCCACCTGCGCCGGCAGCAAGGGTTGCGCCTCGGCGTGGATGGAA 

2595 2605 2615 2625 2635 2645 2655 

ThrLysGlyMetAlaGluArgLeuValGluThrAlaProGluLeuLeuAspGlySerLeuThrValHisLeuSer 
ACCAAGGGCATGGCCGAGCGCCTCGTCGAGACGGCGCCGGAATTGCTCGACGGGTCGCTCACCGTGCATCTCTCC 

2670 2680 2690 2700 2710 2720 2730 

GlyCysAlaLysGlyCysAlaArgProLysProSerGluLeuThrLeuValGlyAlaProSerGlyTyrGlyLeu 
GGCTGCGCCAAGGGCTGCGCCCGGCCGAAGCCGTCCGAACTGACGCTTGTCGGTGCGCCATCAGGATACGGGCTT 

2745 2755 2765 2775 2785 2795 2805 

ValValAsnGlyAlaAlaAsnGlyLeuProSerAlaTyrThrAspGluAsnGlyMetGlySerAlaLeuAlaArg 
GTCGTAAATGGGGCTGCCAATGGCTTGCCAAGCGCCTACACCGATGAGAATGGAATGGGATCCGCCCTTGCCCGG 

2820 2830 2840 2850 2860 2870 2880 

LeuGlyArgLeuValArgGlnAsnLysAspAlaGlyGluSerAlaGlnSerCysLeuThrArgLeuGlyAlaAla 
CTCGGCCGGCTGGTGCGGCAAAACAAAGACGCTGGCGAATCGGCGCAGTCCTGTCTTACACGGCTCGGAGCTGCG 

2895 2905 2915 2925 2935 2945 2955 
ArgValSerAlaAlaPheGluGlnGly*** 
CGCGTCTCGGCAGCGTTCGAACAGGGATAG 

2970 2980 2990 3000 
COBG PROTEIN 

FIRST RESIDUE = 1 
LAST RESIDUE = 459 

              NUMBER    NO. %     WEIGHT   WEIGHT % 

 1  PHE  F         7     1.53    1029.49       2.21 
 2  LEU  L        56    12.20    6332.48      13.57 
 3  ILE  I        21     4.58    2374.68       5.09 
 4  MET  M         8     1.74    1048.32       2.25 
 5  VAL  V        31     6.75    3071.17       6.58 
 6  SER  S        32     6.97    2784.96       5.97 
 7  PRO  P        26     5.66    2523.30       5.41 
 8  THR  T        27     5.88    2728.35       5.85 
 9  ALA  A        78    16.99    5541.12      11.88 
10  TYR  Y         3     0.65     489.18       1.05 
11  *    *         0     0.00       0.00       0.00 
12  HIS  H         5     1.09     685.30       1.47 
13  GLN  Q        13     2.83    1664.78       3.57 
14  ASN  N        10     2.18    1140.40       2.44 
15  LYS  K        10     2.18    1280.90       2.75 
16  ASP  D        19     4.14    2185.57       4.68 
17  GLU  E        24     5.23    3096.96       6.64 
18  CYS  C        10     2.18    1030.10       2.21 
19  TRP  W         2     0.44     372.16       0.80 
20  ARG  R        29     6.32    4526.90       9.70 
21  GLY  G        48    10.46    2736.96       5.87 
22                 0     0.00       0.00       0.00 

RESIDUES = 459 

MOLECULAR WEIGHT = 46661 

INDEX OF POLARITY (%) = 37 

ISOELECTRIC POINT = 6.41 

OD 260 (1 mg/ml) = 0.215 OD 280 (1 mg/ml) = 0.315 

COBG FROM 1 TO 459 

[Profile plot: vertical axis from -1.80 to 1.80; horizontal axis is residue position, with ticks every 45 residues from 45 to 450.] 

FIG. 16E 
cobH GENE (SEQ ID NO: 17) AND COBH PROTEIN (SEQ ID NO: 18) 
SEQUENCE OF THE 8753-BP FRAGMENT FROM 3002 TO 3634 

MetProGluTyrAspTyrIleArgAspGlyAsnAlaIleTyrGluArgSerPheAlaIleIleArgAlaGluAla 
ATGCCTGAGTATGATTACATTCGCGATGGCAACGCCATCTACGAGCGTTCCTTCGCCATCATCCGCGCCGAGGCC 
3002 3012 3022 3032 3042 3052 3062 

AspLeuSerArgPheSerGluGluGluAlaAspLeuAlaValArgMetValHisAlaCysGlySerValGluAla 
GATCTGTCGCGCTTCTCCGAAGAGGAAGCGGATCTGGCTGTGCGCATGGTGCACGCCTGCGGTTCCGTCGAGGCG 
3077 3087 3097 3107 3117 3127 3137 

ThrArgGlnPheValPheSerProAspPheValSerSerAlaArgAlaAlaLeuLysAlaGlyAlaProIleLeu 
ACCAGGCAGTTCGTGTTTTCTCCCGATTTCGTAAGCTCGGCCCGTGCGGCGCTGAAAGCCGGTGCGCCGATCCTC 
3152 3162 3172 3182 3192 3202 3212 

CysAspAlaGluMetValAlaHisGlyValThrArgAlaArgLeuProAlaGlyAsnGluValIleCysThrLeu 
TGCGATGCCGAGATGGTTGCGCACGGTGTCACCCGCGCCCGTCTGCCGGCCGGCAACGAGGTGATCTGCACGCTG 
3227 3237 3247 3257 3267 3277 3287 

ArgAspProArgThrProAlaLeuAlaAlaGluIleGlyAsnThrArgSerAlaAlaAlaLeuLysLeuTrpSer 
CGCGATCCTCGCACGCCCGCACTTGCGGCCGAGATCGGCAACACCCGCTCCGCCGCAGCCCTGAAGCTCTGGAGC 
3302 3312 3322 3332 3342 3352 3362 

GluArgLeuAlaGlySerValValAlaIleGlyAsnAlaProThrAlaLeuPhePheLeuLeuGluMetLeuArg
GAGCGGCTGGCCGGTTCGGTGGTCGCGATCGGCAACGCGCCGACGGCGTTGTTCTTCCTCTTGGAAATGCTGCGC 
3377 3387 3397 3407 3417 3427 3437 

AspGlyAlaProLysProAlaAlaIleLeuGlyMetProValGlyPheValGlyAlaAlaGluSerLysAspAla
GACGGCGCGCCGAAGCCGGCGGCAATCCTCGGCATGCCCGTCGGTTTCGTCGGTGCGGCGGAATCGAAGGATGCG 
3452 3462 3472 3482 3492 3502 3512 

LeuAlaGluAsnSerTyrGlyValProPheAlaIleValArgGlyArgLeuGlyGlySerAlaMetThrAlaAla
CTGGCCGAGAACTCCTATGGCGTTCCCTTCGCCATCGTGCGCGGCCGCCTCGGCGGGAGTGCCATGACGGCGGCA
3527 3537 3547 3557 3567 3577 3587 

AlaLeuAsnSerLeuAlaArgProGlyLeu***
GCGCTTAACTCGCTCGCGAGGCCGGGCCTGTGA
3602 3612 3622 3632 
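
The pairing of each translation line with the DNA line beneath it can be checked mechanically. The following is a minimal Python sketch and is not part of the original figure. It assumes the standard genetic code, and the names `CODON_TABLE`, `THREE_LETTER`, and `translate` are illustrative. It translates the first and last rows of the cobH listing and checks that the stated span 3002 to 3634 holds 210 codons plus one stop codon.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter codes, written as they appear in the figure.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon into three-letter codes."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        THREE_LETTER[CODON_TABLE[dna[i:i + 3]]] for i in range(0, usable, 3)
    )


# First and last DNA rows of the cobH listing (positions 3002 and 3602).
first_row = ("ATGCCTGAGTATGATTACATTCGCGATGGCAACGCCATCTACGAGCGTTCC"
             "TTCGCCATCATCCGCGCCGAGGCC")
last_row = "GCGCTTAACTCGCTCGCGAGGCCGGGCCTGTGA"

print(translate(first_row))
print(translate(last_row))

# Span check: 3002..3634 inclusive should be a whole number of codons.
start, end = 3002, 3634
span = end - start + 1
codons = span // 3
print(span, span % 3, codons)  # 210 residues plus one stop codon
```

Comparing the printed strings with the translation lines in the figure is a quick way to spot misread characters in either line.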
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COBH PROTEIN 



FIRST RESIDUE = 1 
LAST RESIDUE = 210 





NUMBER 


NO. % 


WEIGHT 


WEIGHT % 


1 PHE 


TP 

c 


Q 


4.29 


1323 63 


6 00 


2 LEU 


T 

L 


Zu 


9.52 


2261 60 


10 26 


3 ILE 


T 
1 


1 A 

iU 


4.76 


1130 80 


5 13 


4 MET 




b 


2 86 


786 24 


3 S7 


5 VAL 


V 


1 A 
IH 


6 67 


1386 98 


6 2Q 


6 SER 


c 
o 


1 A 


6 67 


1218 42 


S S3 


7 PRO 


ID 

r 


1 O 

LZ 


5 71 


1164 60 


S 2ft 


8 THR 


i 


1 


3 33 


707 3S 


3 21 


9 ALA 


TV 
A 


A n 


19 05 


2841 60 

^ U ^ X • \J\J 


1 2 fiQ 


10 TYR 


Y 


A 

4 


1 90 


6^2 24 


2 Qfi 


11 * 


★ 


0 


0 00 


0 00 


0 no 

\J m \J\J 


12  HIS  H      2   0.95     274.12     1.24
13  GLN  Q      1   0.48     128.06     0.58
14  ASN  N      6   2.86     684.24     3.10
15  LYS  K      4   1.90     512.36     2.32
16  ASP  D      9   4.29    1035.27     4.70
17  GLU  E     14   6.67    1806.56     8.19
18  CYS  C      3   1.43     309.03     1.40
19  TRP  W      1   0.48     186.08     0.84
20  ARG  R     17   8.10    2653.70    12.03
21  GLY  G     17   8.10     969.34     4.40
22              0   0.00       0.00     0.00


RESIDUES = 210
MOLECULAR WEIGHT = 22050.
INDEX OF POLARITY (%) = 35.
ISOELECTRIC POINT = 6.22
OD 260 (1 mg/ml) = 0.291   OD 280 (1 mg/ml) = 0.467

COBH FROM 1 TO 210
[Profile plot: vertical axis from 1.80 to -1.80; horizontal axis ticks 21 to 210 in steps of 21]

FIG. 16G
cobI GENE (SEQ ID NO: 19) AND COBI PROTEIN (SEQ ID NO: 20)
SEQUENCE OF THE 8753-BP FRAGMENT FROM 3631 TO 4368

MetSerGlyValGlyValGlyArgLeuIleGlyValGlyThrGlyProGlyAspProGluLeuLeuThrValLys 
GTGAGCGGCGTCGGCGTGGGGCGCCTGATCGGTGTTGGGACCGGCCCCGGTGATCCGGAACTTTTGACGGTCAAG 
3631 3641 3651 3661 3671 3681 3691 

AlaValLysAlaLeuGlyGlnAlaAspValLeuAlaTyrPheAlaLysAlaGlyArgSerGlyAsnGlyArgAla 
GCGGTGAAGGCGCTCGGGCAAGCCGATGTGCTTGCCTATTTCGCCAAGGCCGGGCGAAGCGGTAACGGCCGCGCG 

3706 3716 3726 3736 3746 3756 3766 

ValValGluGlyLeuLeuLysProAspLeuValGluLeuProLeuTyrTyrProValThrThrGluIleAspLys 
GTGGTCGAGGGTCTGCTGAAGCCCGATCTTGTCGAGCTGCCGCTATACTATCCGGTGACGACCGAAATCGACAAG 

3781 3791 3801 3811 3821 3831 3841 

AspAspGlyAlaTyrLysThrGlnIleThrAspPheTyrAsnAlaSerAlaGluAlaValAlaAlaHisLeuAla
GACGATGGCGCCTACAAGACCCAGATCACCGACTTCTACAATGCGTCGGCCGAAGCGGTAGCGGCGCATCTTGCC 

3856 3866 3876 3886 3896 3906 3916 

AlaGlyArgThrValAlaValLeuSerGluGlyAspProLeuPheTyrGlySerTyrMetHisLeuHisValArg 
GCCGGGCGCACGGTCGCCGTGCTCAGTGAAGGCGACCCGCTGTTCTATGGTTCCTACATGCATCTGCATGTGCGG 

3931 3941 3951 3961 3971 3981 3991 

LeuAlaAsnArgPheProValGluValIleProGlyIleThrAlaMetSerGlyCysTrpSerLeuAlaGlyLeu
CTCGCCAATCGTTTCCCGGTCGAGGTGATCCCCGGCATTACCGCCATGTCCGGCTGTTGGTCGCTTGCCGGCCTG 

4006 4016 4026 4036 4046 4056 4066 

ProLeuValGlnGlyAspAspValLeuSerValLeuProGlyThrMetAlaGluAlaGluLeuGlyArgArgLeu 
CCGCTGGTGCAGGGCGACGACGTGCTCTCGGTGCTTCCGGGCACCATGGCCGAGGCCGAGCTCGGCCGCAGGCTT 

4081 4091 4101 4111 4121 4131 4141 

AlaAspThrGluAlaAlaValIleMetLysValGlyArgAsnLeuProLysIleArgArgAlaLeuAlaAlaSer
GCGGATACCGAAGCCGCCGTGATCATGAAGGTCGGGCGCAATTTGCCGAAGATCCGTCGGGCGCTCGCTGCCTCC 

4156 4166 4176 4186 4196 4206 4216 

GlyArgLeuAspGlnAlaValTyrValGluArgGlyThrMetLysAsnAlaAlaMetThrAlaLeuAlaGluLys 
GGCCGTCTCGACCAGGCCGTCTATGTCGAACGCGGCACGATGAAGAACGCGGCGATGACGGCTCTTGCGGAAAAG 

4231 4241 4251 4261 4271 4281 4291 
AlaAspAspGluAlaProTyrPheSerLeuValLeuValProGlyTrpLysAspArgPro*** 
GCCGACGACGAGGCGCCCTATTTCTCGCTGGTGCTCGTTCCCGGCTGGAAGGACCGACCATGA

4306 4316 4326 4336 4346 4356 4366
COBI PROTEIN



FIRST RESIDUE = 1 
LAST RESIDUE = 245 







NUMBER 


NO. % 


WEIGHT 


WEIGHT % 


1 


PHE 


c 


R 


2.04 


735.35 


2.84 


2 


LEU 


T, 


9ft 


11.43 


3166.24 


12.24 


3 


ILE 


T 
J. 


/ 


2.86 


791.56 


3.06 


4 


MET 


M 


7 


2.86 


917.28 


3.54 


5 


VAL 


V 




10.20 


2476.75 


9 .57 


6 


SER 


c 


1 u 


4.08 


870.30 


3.36 


7 


PRO 


P 


J. 4 


5.71 


1358.70 


5.25 


8 


THR 


1 


1 0 


4 . 90 


1212.60 


4 . 69 


9 


ALA 


a 
K 


"5/1 


13. 88 


2415.36 


9 33 


10 


TYR 


V 


Q 


3.67 


1467 54 


5 67 


11 


* 


* 


0 


0.00 


0 00 


0 00 


12 


HIS 


H 


3 


1.22 


411.18 


1.59 


13 


GLN 


Q 


/I 
H 


1. 63 


512 .24 


1.98 


14 


ASN 


N 


3 


2 04 


570 20 


2 20 


15 


LYS 


K 


11 


4 49 


1408.99 


5.44 


16 


ASP 


D 




6 12 


1725.45 


6.67 


17 


GLU 


E 


13 


5 31 


1677.52 


6.48 


18 


CYS 


C 


1 


0.41 


103.01 


0.40 


19 


TRP 


W 


2 


0 . 82 


372.16 


1.44 


20 


ARG 


R 


14 


5.71 


2185.40 


8.44 


21 


GLY 


G 


26 


10.61 


1482.52 


5.73 


22 






0 


0.00 


0.00 


0.00 


RESIDUES = 245
MOLECULAR WEIGHT = 25878.
INDEX OF POLARITY (%) = 36.
ISOELECTRIC POINT = 6.17
OD 260 (1 mg/ml) = 0.512   OD 280 (1 mg/ml) = 0.843

COBI FROM 1 TO 245
[Profile plot: vertical axis label -2.10; horizontal axis ticks 24 to 240 in steps of 24]



FIG. 16I
cobJ GENE (SEQ ID NO: 21) AND COBJ PROTEIN (SEQ ID NO: 22)
SEQUENCE OF THE 8753-BP FRAGMENT FROM 4365 TO 5129

MetThrGlyThrLeuTyrValValGlyThrGlyProGlySerAlaLysGlnMetThrProGluThrAlaGluAla 
ATGACCGGTACGCTCTATGTCGTCGGTACCGGACCGGGCAGCGCCAAGCAGATGACGCCGGAAACGGCGGAAGCC 

4365 4375 4385 4395 4405 4415 4425 

ValAlaAlaAlaGlnGluPheTyrGlyTyrPheProTyrLeuAspArgLeuAsnLeuArgProAspGlnIleArg
GTTGCGGCCGCTCAGGAGTTTTACGGCTACTTTCCCTATCTCGACCGGCTGAACCTCAGACCGGATCAGATCCGT 

4440 4450 4460 4470 4480 4490 4500 

ValAlaSerAspAsnArgGluGluLeuAspArgAlaGlnValAlaLeuThrArgAlaAlaAlaGlyValLysVal 
GTCGCCTCGGACAACCGCGAGGAGCTCGATCGGGCACAGGTCGCGCTGACGCGGGCTGCGGCAGGCGTGAAGGTC 

4515 4525 4535 4545 4555 4565 4575 

CysMetValSerGlyGlyAspProGlyValPheAlaMetAlaAlaAlaValCysGluAlaIleAspLysGlyPro
TGCATGGTCTCCGGTGGCGATCCCGGTGTCTTTGCCATGGCGGCCGCCGTCTGCGAGGCGATCGACAAGGGACCG 

4590 4600 4610 4620 4630 4640 4650 

AlaGluTrpLysSerValGluLeuValIleThrProGlyValThrAlaMetLeuAlaValAlaAlaArgIleGly
GCGGAATGGAAGTCGGTTGAACTGGTGATCACGCCCGGCGTGACCGCGATGCTCGCCGTTGCCGCCCGCATCGGC 

4665 4675 4685 4695 4705 4715 4725 

AlaProLeuGlyHisAspPheCysAlaIleSerLeuSerAspAsnLeuLysProTrpGluValIleThrArgArg
GCGCCGCTCGGTCATGATTTCTGTGCGATCTCGCTTTCCGACAATCTGAAGCCCTGGGAAGTCATCACCCGGCGT 

4740 4750 4760 4770 4780 4790 4800 

LeuArgLeuAlaAlaGluAlaGlyPheValIleAlaLeuTyrAsnProIleSerLysAlaArgProTrpGlnLeu
CTCAGGCTGGCGGCGGAAGCGGGCTTCGTCATTGCCCTCTACAATCCGATCAGCAAGGCGCGGCCCTGGCAGCTC 

4815 4825 4835 4845 4855 4865 4875 

GlyGluAlaPheGluLeuLeuArgSerValLeuProAlaSerValProValIlePheGlyArgAlaAlaGlyArg
GGTGAGGCCTTCGAGCTTCTGCGCAGCGTTCTGCCGGCAAGCGTTCCGGTCATCTTCGGCCGTGCGGCCGGGCGG 

4890 4900 4910 4920 4930 4940 4950 

ProAspGluArgIleAlaValMetProLeuGlyGluAlaAspAlaAsnArgAlaAspMetAlaThrCysValIle
CCGGACGAACGGATCGCGGTGATGCCGCTCGGCGAGGCCGATGCCAACCGCGCCGACATGGCGACCTGCGTCATC 

4965 4975 4985 4995 5005 5015 5025 

IleGlySerProGluThrArgIleValGluArgAspGlyGlnProAspLeuValTyrThrProArgPheTyrAla
ATCGGCTCGCCGGAGACGCGCATCGTCGAGCGCGACGGCCAACCCGATCTCGTCTACACACCGCGCTTCTATGCA 

5040 5050 5060 5070 5080 5090 5100 
GlyAlaSerGln*** 
GGGGCGAGCCAGTGA 

5115 5125 



FIG. 16J 
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COBJ PROTEIN 



FIRST RESIDUE = 1 
LAST RESIDUE = 254 







NUMBER        NO.      %     WEIGHT   WEIGHT %

 1  PHE  F      8   3.15    1176.56     4.34
 2  LEU  L     20   7.87    2261.60     8.35
 3  ILE  I     13   5.12    1470.04     5.43
 4  MET  M      7   2.76     917.28     3.39
 5  VAL  V     23   9.06    2278.61     8.41
 6  SER  S     11   4.33     957.33     3.53
 7  PRO  P     18   7.09    1746.90     6.45
 8  THR  T     12   4.72    1212.60     4.48
 9  ALA  A     40  15.75    2841.60    10.49
10  TYR  Y      7   2.76    1141.42     4.21
11  *    *      0   0.00       0.00     0.00
12  HIS  H      1   0.39     137.06     0.51
13  GLN  Q      7   2.76     896.42     3.31
14  ASN  N      5   1.97     570.20     2.11
15  LYS  K      6   2.36     768.54     2.84
16  ASP  D     13   5.12    1495.39     5.52
17  GLU  E     16   6.30    2064.64     7.62
18  CYS  C      4   1.57     412.04     1.52
19  TRP  W      3   1.18     558.24     2.06
20  ARG  R     19   7.48    2965.90    10.95
21  GLY  G     21   8.27    1197.42     4.42
22              0   0.00       0.00     0.00




RESIDUES = 254
MOLECULAR WEIGHT = 27088.
INDEX OF POLARITY (%) = 35.
ISOELECTRIC POINT = 5.43
OD 260 (1 mg/ml) = 0.575   OD 280 (1 mg/ml) = 0.922

COBJ FROM 1 TO 254
[Profile plot: vertical axis label -2.10; horizontal axis ticks 25 to 250 in steps of 25]



FIG. 16K 
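
The WEIGHT and WEIGHT % columns of the COBJ table appear to follow simple arithmetic: each WEIGHT entry is the residue count times a per-residue mass, and WEIGHT % is that entry divided by the stated molecular weight (27088). This reading is inferred from the printed numbers and is not stated in the figure, so the Python sketch below is only a consistency check on three rows copied from the table. The function names are illustrative.

```python
MOLECULAR_WEIGHT = 27088.0  # stated for COBJ in the summary block
TOTAL_RESIDUES = 254        # LAST RESIDUE = 254

# (name, count, weight) exactly as printed for three rows of the table.
ROWS = [
    ("PHE", 8, 1176.56),
    ("LEU", 20, 2261.60),
    ("ALA", 40, 2841.60),
]


def count_percent(count: int, total: int = TOTAL_RESIDUES) -> float:
    """Share of residues, as in the '%' column."""
    return round(100.0 * count / total, 2)


def weight_percent(weight: float, mw: float = MOLECULAR_WEIGHT) -> float:
    """Share of the stated molecular weight (assumed meaning of 'WEIGHT %')."""
    return round(100.0 * weight / mw, 2)


def residue_mass(count: int, weight: float) -> float:
    """Per-residue mass implied by a row (assumes WEIGHT = count * mass)."""
    return round(weight / count, 2)


for name, count, weight in ROWS:
    print(name, count_percent(count), weight_percent(weight),
          residue_mass(count, weight))
```

For these rows the computed percentages agree with the printed columns (3.15 and 4.34 for PHE, 7.87 and 8.35 for LEU, 15.75 and 10.49 for ALA), which supports the assumed reading.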
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cobK GENE (SEQ ID NO: 23) AND COBK PROTEIN (SEQ ID NO: 24) 
SEQUENCE OF THE 8753-BP ECORI-ECORI FRAGMENT FROM 2861 TO 3646 ON THE COMPLEMENTARY STRAND

MAGSLFDTSAMEKPRILILGGTTEA 
ATGGCGGGTTCGCTGTTCGACACGTCAGCCATGGAAAAACCTCGTATTCTGATTCTGGGTGGCACCACCGAGGCA 
2861 2871 2881 2891 2901 2911 2921 2931 
RELARRLAEDVRYDTAISLAGRTAD 
CGCGAACTCGCGCGCCGCTTGGCCGAAGATGTCCGCTACGACACCGCCATCTCGCTGGCCGGCCGCACCGCGGAC 
2936 2946 2956 2966 2976 2986 2996 3006 
PRPQPVKTRIGGFGGADGLAHFVHD 
CCGCGGCCGCAGCCGGTCAAGACGCGCATCGGCGGCTTTGGCGGCGCCGATGGGCTGGCGCATTTCGTGCATGAC 
3011 3021 3031 3041 3051 3061 3071 3081 
ENIALLVDATHPFAARISHNAADAA 
GAAAACATCGCGCTGCTGGTCGATGCGACGCACCCCTTTGCCGCACGCATTTCGCACAACGCCGCGGACGCAGCG 
3086 3096 3106 3116 3126 3136 3146 3156 
QRTGVALIALRRPEWVPLPGDRWTA 
CAAAGAACCGGCGTTGCGCTTATCGCCCTCCGCCGACCGGAATGGGTGCCCCTGCCTGGCGACCGCTGGACTGCT 
3161 3171 3181 3191 3201 3211 3221 3231 
VDSVVEAVSALGDRRRRVFLAIGRQ 
GTCGATAGCGTTGTCGAGGCCGTCAGCGCGCTCGGTGATCGGCGACGCCGCGTCTTCCTGGCGATAGGTCGACAG 
3236 3246 3256 3266 3276 3286 3296 3306 
EAFHFEVAPQHSYVIRSVDPVTPPL 
GAAGCTTTCCACTTCGAGGTCGCGCCGCAGCACAGCTACGTCATCCGCAGCGTCGATCCGGTGACGCCGCCGCTT 
3311 3321 3331 3341 3351 3361 3371 3381 
NLPDQEAILATGPFAEADEAALLRS 
AATCTGCCCGACCAGGAGGCGATCCTGGCGACCGGTCCCTTTGCGGAAGCCGACGAAGCCGCGTTGCTCAGGAGT 
3386 3396 3406 3416 3426 3436 3446 3456
RQIDVIVAKNSGGSATYGKIAAARR 
CGGCAGATCGATGTGATCGTCGCCAAGAACAGCGGTGGCAGCGCCACCTACGGCAAGATTGCCGCAGCGCGCCGG 
3461 3471 3481 3491 3501 3511 3521 3531 
LGIEVIMVERRKPADVPTVGSCDEA 
CTCGGCATCGAGGTGATCATGGTCGAGCGGCGCAAGCCCGCGGACGTGCCGACGGTCGGCAGTTGCGACGAGGCA 
3536 3546 3556 3566 3576 3586 3596 3606 
LNRIAHWLAPA 
CTCAACCGCATCGCTCACTGGCTCGCCCCTGCATGA 
3611 3621 3631 3641 



FIG. 16L 
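
The residue counts in the COBK composition table can be cross-checked against the one-letter translation in FIG. 16L. The Python sketch below is an illustration added for that purpose and is not part of the figure. It joins the eleven translation rows copied from the listing, counts each symbol, and computes the share of the total for a few residues; the variable names are arbitrary.

```python
from collections import Counter

# One-letter translation rows of the cobK listing, in figure order.
ROWS = [
    "MAGSLFDTSAMEKPRILILGGTTEA",
    "RELARRLAEDVRYDTAISLAGRTAD",
    "PRPQPVKTRIGGFGGADGLAHFVHD",
    "ENIALLVDATHPFAARISHNAADAA",
    "QRTGVALIALRRPEWVPLPGDRWTA",
    "VDSVVEAVSALGDRRRRVFLAIGRQ",
    "EAFHFEVAPQHSYVIRSVDPVTPPL",
    "NLPDQEAILATGPFAEADEAALLRS",
    "RQIDVIVAKNSGGSATYGKIAAARR",
    "LGIEVIMVERRKPADVPTVGSCDEA",
    "LNRIAHWLAPA",
]

protein = "".join(ROWS)
counts = Counter(protein)


def percent(symbol: str) -> float:
    """Share of the sequence made up by one residue, rounded as in the table."""
    return round(100.0 * counts[symbol] / len(protein), 2)


print(len(protein))
for symbol in "QNKCW":
    print(symbol, counts[symbol], percent(symbol))
```

The length comes out at 261, matching LAST RESIDUE = 261, and the counts for Q, N, K, C and W (6, 5, 5, 1, 3) match the GLN, ASN, LYS, CYS and TRP rows of the table.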
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NAME = COBK 



FIRST RESIDUE = 1 
LAST RESIDUE = 261 









NUMBER 


NO. % 


WEIGHT 


WEIGHT 


1 

X 


PHF. 


F 

r 


Q 
o 


^ 07 


1 1 7 fi RS 

X X / D • J ^ 


4 10 
4 . X y 


9 


J-il-j u 


T. 




O • 1 o 


94ft7 PS 


P P^ 




TT.F 


T 
X 


X u 


U • X o 


1 ftflQ "^4 

X 0 U . 0 4 


0.44 


4 


MFT 


M 




1 1 S 

X • X J 


19 


1 4 0 
1 . 4 U 




VAT. 


V 


^ X 


O • U J 


onftf) 44 

/I. U O U . 4 4 


7 41 
/ . 4 X 


u 


O Ej fx 


q 


X 


4 fin 


104 4 '^P 
X U 4 4 . JO 


0 . / Z 


7 


PRO 

XT IWJ 


p 

XT 


X / 


fi SI 
D . O X 


1 fi4 Q Qfl 
X D 4 ^ . ^ U 


S PP 
0 . OO 


Q 
o 


1 nrs. 


T 
i 


X J 


4 QP 


1 -^1 Q an 


4 ^P 
4 . DO 


Q 


AT. A 


ri 




1 fi HQ 
X D . U y 


Z y O J • J D 




1 n 


TYR 


Y 




1 IS 
X • xo 


4PQ 1 Q 


1 . / 4 


1 1 

± JL 


★ 


★ 


n 
u 


u • u u 


u . uu 


u . uu 


12 


HIS 


H 


7 


? fi8 


zf >j y • T X 


49 


13  GLN  Q      6   2.30     768.35     2.74
14  ASN  N      5   1.92     570.21     2.03
15  LYS  K      5   1.92     640.47     2.28
16  ASP  D     17   6.51    1955.46     6.96
17  GLU  E     15   5.75    1935.64     6.89
18  CYS  C      1   0.38     103.01     0.37
19  TRP  W      3   1.15     558.24     1.99
20  ARG  R     26   9.96    4058.63    14.45
21  GLY  G     19   7.28    1083.41     3.86
22              0   0.00       0.00     0.00



RESIDUES = 261
MOLECULAR WEIGHT (MONOISOTOPIC) = 28078.7988
MOLECULAR WEIGHT (AVERAGE) = 28096.0195
INDEX OF POLARITY (%) = 40.64
ISOELECTRIC POINT = 7.54
OD 260 (1 mg/ml) = 0.509   OD 280 (1 mg/ml) = 0.721

HYDROPHILICITY PROFILE OF THE COBK PROTEIN FROM 1 TO 261
[Profile plot: vertical axis from 2.10 to -2.10; horizontal axis ticks 26 to 260 in steps of 26]
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cobL GENE (SEQ ID NO: 25) AND COBL PROTEIN (SEQ ID NO: 26) 
SEQUENCE OF THE 8753-BP FRAGMENT FROM 5862 TO 7103 

MetAlaAspValSerAsnSerGluProAlaIleValSerProTrpLeuThrValIleGlyIleGlyGluAspGly
ATGGCTGACGTGTCGAACAGCGAACCCGCCATAGTCTCCCCCTGGCTGACCGTCATCGGTATCGGTGAGGATGGT 

5862 5872 5882 5892 5902 5912 5922 

ValAlaGlyLeuGlyAspGluAlaLysArgLeuIleAlaGluAlaProValValTyrGlyGlyHisArgHisLeu 
GTAGCGGGTCTCGGCGACGAGGCCAAGCGGCTGATCGCCGAAGCGCCGGTCGTCTACGGCGGCCATCGTCATCTG 

5937 5947 5957 5967 5977 5987 5997 

GluLeuAlaAlaSerLeuIleThrGlyGluAlaHisAsnTrpLeuSerProLeuGluArgSerValValGluIle 
GAGCTCGCCGCCTCCCTCATCACCGGCGAAGCGCACAATTGGCTAAGCCCCCTCGAACGCTCGGTCGTCGAGATC 

6012 6022 6032 6042 6052 6062 6072 

ValAlaArgArgGlySerProValValValLeuAlaSerGlyAspProPhePhePheGlyValGlyValThrLeu 
GTCGCGCGTCGCGGCAGCCCGGTGGTGGTGCTTGCCTCGGGCGACCCGTTCTTCTTCGGCGTCGGCGTGACGCTG 

6087 6097 6107 6117 6127 6137 6147 

AlaArgArgIleAlaSerAlaGluIleArgThrLeuProAlaProSerSerIleSerLeuAlaAlaSerArgLeu
GCGCGCCGCATCGCCTCGGCCGAAATACGCACGCTTCCGGCGCCGTCGTCGATCAGTCTTGCCGCCTCGCGCCTC 

6162 6172 6182 6192 6202 6212 6222 

GlyTrpAlaLeuGlnAspAlaThrLeuValSerValHisGlyArgProLeuAspLeuValArgProHisLeuHis 
GGCTGGGCGCTGCAGGATGCGACGCTCGTCTCCGTACATGGGCGGCCGCTGGATCTGGTGCGACCGCATTTGCAT 

6237 6247 6257 6267 6277 6287 6297

ProGlyAlaArgValLeuThrLeuThrSerAspGlyAlaGlyProArgAspLeuAlaGluLeuLeuValSerSer 
CCGGGGGCGCGTGTGCTTACGCTCACGTCGGACGGTGCGGGTCCGCGAGACCTTGCCGAGCTTCTGGTTTCAAGC 

6312 6322 6332 6342 6352 6362 6372

GlyPheGlyGlnSerArgLeuThrValLeuGluAlaLeuGlyGlyAlaGlyGluArgValThrThrGlnIleAla
GGCTTCGGTCAGTCGCGACTGACCGTGCTCGAAGCGCTGGGCGGCGCCGGCGAACGGGTGACGACGCAGATCGCC 

6387 6397 6407 6417 6427 6437 6447 

AlaArgPheMetLeuGlyLeuValHisProLeuAsnValCysAlaIleGluValAlaAlaAspGluGlyAlaArg
GCGCGCTTCATGCTCGGCCTCGTGCATCCTTTGAACGTCTGCGCCATTGAGGTGGCGGCCGACGAGGGCGCGCGC 

6462 6472 6482 6492 6502 6512 6522 

IleLeuProLeuAlaAlaGlyArgAspAspAlaLeuPheGluHisAspGlyGlnIleThrLysArgGluValArg
ATCCTGCCGCTTGCCGCCGGCCGCGACGATGCGCTGTTCGAACATGACGGGCAGATCACCAAGCGGGAGGTGCGG 

6537 6547 6557 6567 6577 6587 6597 

AlaLeuThrLeuSerAlaLeuAlaProArgLysGlyGluLeuLeuTrpAspIleGlyGlyGlySerGlySerIle
GCGCTGACGCTGTCGGCACTCGCACCGCGCAAGGGCGAACTGCTATGGGACATCGGCGGCGGCTCCGGCTCGATC 

6612 6622 6632 6642 6652 6662 6672 

GlyIleGluTrpMetLeuAlaAspProThrMetGlnAlaIleThrIleGluValGluProGluArgAlaAlaArg
GGCATCGAATGGATGCTCGCCGATCCGACCATGCAGGCGATCACCATCGAGGTTGAGCCGGAGCGGGCAGCGCGC 

6687 6697 6707 6717 6727 6737 6747 
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IleGlyArgAsnAlaThrMetPheGlyValProGlyLeuThrValValGluGlyGluAlaProAlaAlaLeuAla 
ATCGGCCGCAACGCGACGATGTTCGGCGTGCCCGGGCTGACGGTTGTCGAAGGCGAGGCGCCGGCGGCGCTTGCC 
6762 6772 6782 6792 6802 6812 6822 

GlyLeuProGlnProAspAlaIlePheIleGlyGlyGlyGlySerGluAspGlyValMetGluAlaAlaIleGlu
GGCCTGCCACAACCGGACGCGATCTTCATCGGCGGCGGCGGCAGCGAAGACGGCGTCATGGAAGCAGCGATCGAG 
6837 6847 6857 6867 6877 6887 6897 

AlaLeuLysSerGlyGlyArgLeuValAlaAsnAlaValThrThrAspMetGluAlaValLeuLeuAspHisHis 
GCGCTCAAGTCAGGCGGACGGCTGGTTGCCAACGCGGTGACGACGGACATGGAAGCGGTGCTGCTCGATCATCAC 

6912 6922 6932 6942 6952 6962 6972 

AlaArgLeuGlyGlySerLeuIleArgIleAspIleAlaArgAlaGlyProIleGlyGlyMetThrGlyTrpLys
GCGCGGCTCGGCGGTTCGCTGATCCGCATCGATATCGCGCGTGCAGGACCCATCGGCGGCATGACCGGCTGGAAG 

6987 6997 7007 7017 7027 7037 7047 
ProAlaMetProValThrGlnTrpSerTrpThrLysGly*** 
CCGGCCATGCCGGTCACCCAATGGTCGTGGACGAAGGGCTAA 

7062 7072 7082 7092 7102 
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COBL PROTEIN 



FIRST RESIDUE 
LAST RESIDUE 



NUMBER 



NO, 



% 
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10 


2.42 


13 


GLN 


Q 


7 


. • '1.69 


14 


ASN 


N 


5 


1.21 


15 


LYS 


K 


6 


1.45 


16 


ASP 


D 


19 


4.60 


17 


GLU 


E 


27 


6.54 


18 


CYS 


C 


1 


0.24 


19 


TRP 


W 


8 


1.94 


20 


ARG 


R 


28 


6.78 


21 


GLY 


G 


51 


12.35 


22 






0 


0.00 



1 

= 413 

WEIGHT 

1176.56 
5314.76 
2940.08 
1179.36 
3368.38 
2175.75 
2329.20 
2122.05 
3978.24 
163.06 
0.00 
1370.60 
896.42 
570.20 
768.54 
2185.57 
3484.08 
103.01 
1488.64 
4370.80 
2908.02 
0.00 



WEIGHT 

2.74 
12.39 
6.85 
2.75 
7.85 
5.07 
5.43 
4.95 
9.27 
0.38 
0.00 
3.19 
2.09 
1.33 
1.79 
5.09 
8.12 
0.24 
3.47 
10.19 
6.78 
0.00 



RESIDUES = 413
MOLECULAR WEIGHT = 42911.
INDEX OF POLARITY (%) = 36.
ISOELECTRIC POINT = 5.70
OD 260 (1 mg/ml) = 0.754   OD 280 (1 mg/ml) = 1.064

COBL FROM 1 TO 413
[Profile plot: vertical axis label -1.80; horizontal axis ticks 41 to 410 in steps of 41]
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cobM GENE (SEQ ID NO: 27) AND COBM PROTEIN (SEQ ID NO: 28) 
SEQUENCE OF THE 8753-BP FRAGMENT FROM 7172 TO 7930 

MetThrValHisPheIleGlyAlaGlyProGlyAlaAlaAspLeuIleThrValArgGlyArgAspLeuIleGly
ATGACGGTACATTTCATCGGCGCCGGCCCGGGAGCCGCAGACCTGATCACGGTGCGTGGTCGCGACCTGATCGGG

7172 7182 7192 7202 7212 7222 7232 

ArgCysProValCysLeuTyrAlaGlySerIleValSerProGluLeuLeuArgTyrCysProProGlyAlaArg
CGCTGCCCGGTCTGCCTTTACGCCGGCTCGATCGTCTCGCCGGAGCTGCTGCGATATTGCCCGCCGGGCGCCCGC 

7247 7257 7267 7277 7287 7297 7307 

IleValAspThrAlaProMetSerLeuAspGluIleGluAlaGluTyrValLysAlaGluAlaGluGlyLeuAsp 
ATTGTCGATACGGCGCCGATGTCCCTCGACGAGATCGAGGCGGAGTATGTGAAGGCCGAAGCCGAAGGGCTCGAC 

7322 7332 7342 7352 7362 7372 7382 

ValAlaArgLeuHisSerGlyAspLeuSerValTrpSerAlaValAlaGluGlnIleArgArgLeuGluLysHis
GTGGCGCGGCTTCATTCGGGCGACCTTTCGGTCTGGAGTGCTGTGGCCGAACAGATCCGCCGGCTCGAGAAGCAT 

7397 7407 7417 7427 7437 7447 7457 

GlyIleAlaTyrThrMetThrProGlyValProSerPheAlaAlaAlaAlaSerAlaLeuGlyArgGluLeuThr
GGCATCGCCTATACGATGACGCCGGGCGTTCCTTCCTTTGCGGCGGCGGCTTCAGCGCTCGGTCGCGAATTGACC 

7472 7482 7492 7502 7512 7522 7532 

IleProAlaValAlaGlnSerLeuValLeuThrArgValSerGlyArgAlaSerProMetProAsnSerGluThr 
ATTCCGGCCGTGGCCCAGAGCCTGGTGCTGACCCGCGTTTCGGGCCGCGCCTCGCCGATGCCGAACTCAGAAACG 

7547 7557 7567 7577 7587 7597 7607 

LeuSerAlaPheGlyAlaThrGlySerThrLeuAlaIleHisLeuAlaIleHisAlaLeuGlnGlnValValGlu
CTTTCCGCTTTCGGCGCTACGGGATCGACGCTGGCAATCCACCTTGCGATCCATGCGCTTCAGCAGGTGGTCGAG 

7622 7632 7642 7652 7662 7672 7682 

GluLeuThrProLeuTyrGlyAlaAspCysProValAlaIleValValLysAlaSerTrpProAspGluArgVal
GAACTGACGCCGCTCTACGGTGCCGACTGCCCGGTCGCCATCGTCGTCAAGGCCTCCTGGCCGGACGAACGCGTG 

7697 7707 7717 7727 7737 7747 7757 

ValArgGlyThrLeuGlyAspIleAlaAlaLysValAlaGluGluProIleGluArgThrAlaLeuIlePheVal 
GTGCGCGGCACGCTCGGTGACATCGCCGCCAAGGTGGCGGAAGAGCCGATCGAGCGCACGGCGCTGATCTTCGTC 

7772 7782 7792 7802 7812 7822 7832 

GlyProGlyLeuGluAlaSerAspPheArgGluSerSerLeuTyrAspProAlaTyrGlnArgArgPheArgGly 
GGTCCGGGGCTCGAAGCCTCCGATTTCCGTGAAAGCTCGCTCTACGATCCCGCCTATCAGCGGCGCTTCAGAGGG 

7847 7857 7867 7877 7887 7897 7907 
ArgGlyGlu 
CGCGGCGAA 

7922 7932 7942 7952 7962 7972 7982 
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COBM PROTEIN 



FIRST RESIDUE = 1 
LAST RESIDUE = 253 









NUMBER 


NO. -6 


WEIGHT 


WEIGHT 


1 


PHE 


F 


6 
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3.29 
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LEU 
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9.49 
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1.91 


16 
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17 
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E 


19 
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9.13 


18 


CYS 


C 


4 
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412.04 


1.53 


19 


TRP 


W 


2 


0.79 


372.16 


1.39 


20 
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R 


19 


7.51 


2965.90 


11.05 


21 


GLY 


G 


22 


8.70 


1254.44 


4.67 


22 






0 


0.00 


0.00 


0.00 




RESIDUES = 253
MOLECULAR WEIGHT = 26846.
INDEX OF POLARITY (%) = 38.
ISOELECTRIC POINT = 5.58
OD 260 (1 mg/ml) = 0.461   OD 280 (1 mg/ml) = 0.724

COBM FROM 1 TO 253
[Profile plot: vertical axis label -1.50; horizontal axis ticks 25 to 250 in steps of 25]



F/G. 
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80 




RETENTION TIME ON THE COLUMN 



FIG. 20 
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1210 
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GCGCGGGCGC 


GATCCGGTCG 


TCCCGAGCCA 
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TCCGCTTACG 
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CATGATGAGC 
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TCGACATTTC 


CAACCTCCCC 


GACACCACGA 


CGAGACCTCT 


GTACTACTCG 


TTCTAACTGG 


AGCTGTAAAG 


GTTGGAGGGG 


CTGTGGTGCT 


1570 


1580 


1590 


1600 


1610 


1620 


TTTCCGTCCG 


GGAGGTTTTC 


GGTATTGATA 


CGGATTTGCG 


CGTTCCTGCC 


TATTCGAAGG 


AAAGGCAGGC 


CCTCCAAAAG 


CCATAACTAT 


GCCTAAACGC 


GCAAGGACGG 


ATAAGCTTCC 


1630 


1640 


1650 


1660 


1670 


1680 


GCGACGCCTA 


TGTCCCGGAT 


CTGGATCCGG 


ACTACCTCTT 


CGACCGCGAA 


ACGACGCTCG 


CGCTGCGGAT 


ACAGGGCCTA 


GACCTAGGCC 


TGATGGAGAA 


GCTGGCGCTT 


TGCTGCGAGC 


1690 


1700 


1710 


1720 


1730 


1740 


CCATTCTCGC 


AGGCTTCGCC 


CACAACCGAC 


GCGTGATGGT 


GTCGGGCTAT 


CACGGCACCG 


GGTAAGAGCG 


TCCGAAGCGG 


GTGTTGGCTG 


CGCACTACCA 


CAGCCCGATA 
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CCATATCGAG 


CAGGTCGCCG 
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GTCCAGCGGC 


GCGCGGAGTT 


GACCGGCACG 


CACGCGCAGT 
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ACCTCGATAG 


CCATGTCAGC 


CGTATCGACC 
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TCTATCACGG 


CACGCAGCAG 


ATCAACCAGG 


CGCAGATGGA 


CCGCTGGTCG 
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AGATAGTGCC 


GTGCGTCGTC 


TAGTTGGTCC 


GCGTCTACCT 


GGCGACCAGC 


TAGCAGTGGT 
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CGCTGAACTA 


CCTGCCGCAC 


GACAAGGAAG 
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AGCTGTAGCA 
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AGCGAAAGCG 


GAAGGCGCAC 


TGGAAGGAGT 
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CGAGCTGGAG 
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GGAATGCGCT 


GCCAACATCG 
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CGGTTGTAGC 


ACGAGCTTCG 


GTGGCGGACT 


AGGGTGCCGG 
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TGCCGTCCCC 


TTTGGGAGGG 


CGGGTCATGA 


CGCTGTGGCA 
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CGCCCCACTG 
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GCCCAGTACT 


GCGACACCGT 


TTGGCCTACT 


GCGGGGTGAC 
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GGGCGCCGTC 


GCCTCTGGCT 


GAAGAAGGAA 


CTGTCGTGAG 


CTCGAATTCG 
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CCCGCGGCAG 


CGGAGACCGA 


CTTCTTCCTT 


GACAGCACTC 


GAGCTTAAGC 


TTCCGTTTCG 
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CAACCACGCG 


CGAGAATGCT 


GCGGAACCGT 


TCAAGCGGGC 


GCTTTCCGGC 


TGCATCCGAT 


GTTGGTGCGC 


GCTCTTACGA 


CGCCTTGGCA 


AGTTCGCCCG 


CGAAAGGCCG 


ACGTAGGCTA 
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CGATCGCGGG 


CGATGCCGAG 


GTGGAAGTCG 


CCTTCGCCAA 


CGAGCGGCCG 


GGCATGACCG 


GCTAGCGCCC 


GCTACGGCTC 


CACCTTCAGC 


GGAAGCGGTT 


GCTCGCCGGC 


CCGTACTGGC 
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GCGAACGCAT 


CCGTCTGCCG 


GAACTTTCCA 


AGCGCCCGAC 


CCTGCAGGAA 


CTTGCCGTGA 


CGCTTGCGTA 


GGCAGACGGC 


CTTGAAAGGT 


TCGCGGGCTG 


GGACGTCCTT 


GAACGGCACT 


2830 
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2850 


2860 


2870 


2880 


CCCGCGGGCT 


CGGTGACAGC 


ATGGCGCTGC 


GCAAGGCCTG 


TACGCATGCG 


CGGATCCAGC 


GGGCGCCCGA 


GCCACTGTCG 


TACCGCGACG 


CGTTCCGGAC 


ATGCGTACGC 


GCCTAGGTCG 


2890 


2900 


2910 


2920 


2930 


2940 


GCACCATGTC 


GCCGCAAGGG 


GCGGACGCCC 


GCGCGATCTT 


CGATGCGGTG 


GAGCAGGCTC 


CGTGGTACAG 


CGGCGTTCCC 


CGCCTGCGGG 


CGCGCTAGAA 


GCTACGCCAC 


CTCGTCCGAG 
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2960 


2970 


2980 


2990 


3000 


GTGTCGAGGC 


GATCGGGTCG 


TTGCGCATGG 


CGGGTGTCGC 


CAAGAACCTC 


AACGTCATGC 


CACAGCTCCG 


CTAGCCCAGC 


AACGCGTACC 


GCCCACAGCG 


GTTCTTGGAG 


TTGCAGTACG 
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3010 
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3030 


TCGAAGAGAA 


ATACGCCAAG 


GCGAATTTCG 


AGCTTCTCTT 


TATGCGGTTC 


CGCTTAAAGC 
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TCGGCGAGGC 


CGTAGCGCTG 


CTGGTGCGCG 


AGCCGCTCCG 


GCATCGCGAC 


GACCACGCGC 


3130 


3140 


3150 


CTGCCGGCAA 


GGTGCTCGAC 


CTCTGGCGCG 


GACGGCCGTT 


CCACGAGCTG 


GAGACCGCGC 


3190 


3200 


3210 


TTGAGCACCT 


GTCGTCGACG 


ATCAACAACC 


AACTCGTGGA 


CAGCAGCTGC 


TAGTTGTTGG 


3250 
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3270 


TGCTGACCTC 


GATGGAAGTC 


GCCGAGAAAT 


ACGACTGGAG 


CTACCTTCAG 


CGGCTCTTTA 


3310 


3320 


3330 


AGGAAAGCGA 


GACCGACGAA 


GACCAGCCGC 


TCCTTTCGCT 


CTGGCTGCTT 


CTGGTCGGCG 


3370 


3380 


3390 


ACGAGGAAGC 


CGGCGACGAT 


GCCGCACCCG 


TGCTCCTTCG 


GCCGCTGCTA 


CGGCGTGGGC 


3430 


3440 


3450 


TGGAAGAAGG 


CGAGATGGAC 


GGCGCGGAGA 


ACCTTCTTCC 


GCTCTACCTG 


CCGCGCCTCT 


3490 


3500 


3510 


ACGAGGACAG 


CGAAACGCCC 


GGCGAGGTCA 


TGCTCCTGTC 


GCTTTGCGGG 


CCGCTCCAGT 


3550 


3560 


3570 


ACGAGAAGGT 


CGACTACGCC 


GTCTTCACCC 


TGCTCTTCCA 


GCTGATGCGG 


CAGAAGTGGG 
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3610 3620 3630 3640 3650 3660 

AGCTTTGCGA CGAGGCCGAG CTCGACCGGC TGCGCGCCTT CCTCGACAAG CAGCTTGCCC 
TCGAAACGCT GCTCCGGCTC GAGCTGGCCG ACGCGCGGAA GGAGCTGTTC GTCGAACGGG 

3670 3680 3690 3700 3710 3720 

ATCTTCAAGG CGCGGTCGGC CGCCTTGCCA ACCGGCTGCA GCGCCGCCTG ATGGCGCAGC 
TAGAAGTTCC GCGCCAGCCG GCGGAACGGT TGGCCGACGT CGCGGCGGAC TACCGCGTCG 

3730 3740 3750 3760 3770 3780 

AGAACCGCTC CTGGGAGTTC GATCTCGAAG AGGGGTATCT CGATTCGGCG CGGCTTCAGC 
TCTTGGCGAG GACCCTCAAG CTAGAGCTTC TCCCCATAGA GCTAAGCCGC GCCGAAGTCG 

3790 3800 3810 3820 3830 3840 

GCATCATCAT CGATCCGATG CAGCCGCTTT CCTTCAAGCG CGAAAAGGAC ACCAACTTCC 
CGTAGTAGTA GCTAGGCTAC GTCGGCGAAA GGAAGTTCGC GCTTTTCCTG TGGTTGAAGG 

3850 3860 3870 3880 3890 3900 

GCGATACCGT CGTGACGCTG CTGATCGACA ATTCCGGCTC GATGCGCGGC CGTCCGATCA 
CGCTATGGCA GCACTGCGAC GACTAGCTGT TAAGGCCGAG CTACGCGCCG GCAGGCTAGT 

3910 3920 3930 3940 3950 3960 

CGGTTGCCGC CACCTGCGCC GATATCCTGG CGCGCACGCT CGAGCGCTGC GGCGTCAAGG 
GCCAACGGCG GTGGACGCGG CTATAGGACC GCGCGTGCGA GCTCGCGACG CCGCAGTTCC 

3970 3980 3990 4000 4010 4020 

TCGAGATCCT CGGTTTTACC ACCAAGGCGT GGAAGGGTGG GCAGTCACGC GAGAAGTGGC 
AGCTCTAGGA GCCAAAATGG TGGTTCCGCA CCTTCCCACC CGTCAGTGCG CTCTTCACCG 

4030 4040 4050 4060 4070 4080 

TGGCCGGCGG CAAGCCACAG GCCCCGGGTC GCCTCAACGA CCTGCGACAC ATCGTCTACA 
ACCGGCCGCC GTTCGGTGTC CGGGGCCCAG CGGAGTTGCT GGACGCTGTG TAGCAGATGT 

4090 4100 4110 4120 4130 4140 

AGTCTGCCGA CGCTCCGTGG CGCCGGGCAC GACGCAATCT CGGCCTGATG ATGCGGGAAG 
TCAGACGGCT GCGAGGCACC GCGGCCCGTG CTGCGTTAGA GCCGGACTAC TACGCCCTTC 

4150 4160 4170 4180 4190 4200 

GCCTGCTCAA GGAAAACATC GACGGCGAGG CGTTGATTTG GGCGCATGAG CGGCTGATGG 
CGGACGAGTT CCTTTTGTAG CTGCCGCTCC GCAACTAAAC CCGCGTACTC GCCGACTACC 
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4210 4220 4230 4240 4250 4260 

CGCGGCGCGA ACAGCGGCGC ATCCTGATGA TGATTTCGGA CGGCGCGCCG GTCGACGACT 
GCGCCGCGCT TGTCGCCGCG TAGGACTACT ACTAAAGCCT GCCGCGCGGC CAGCTGCTGA 

4270 4280 4290 4300 4310 4320 

CGACGCTGTC GGTCAATCCA GGAAACTATC TGGAGCGTCA CCTGCGCGCG GTCATCGAGC 
GCTGCGACAG CCAGTTAGGT CCTTTGATAG ACCTCGCAGT GGACGCGCGC CAGTAGCTCG 

4330 4340 4350 4360 4370 4380 

AGATCGAAAC GCGCTCGCCG GTGGAACTGC TGGCGATCGG TATCGGCCAC GACGTGACGC 
TCTAGCTTTG CGCGAGCGGC CACCTTGACG ACCGCTAGCC ATAGCCGGTG CTGCACTGCG 

4390 4400 4410 4420 4430 4440 

GCTACTATCG CCGTGCCGTC ACCATCGTCG ATGCCGATGA GCTTGCCGGC GCGATGACCG 
CGATGATAGC GGCACGGCAG TGGTAGCAGC TACGGCTACT CGAACGGCCG CGCTACTGGC 

4450 4460 4470 4480 4490 4500 

AACAGCTGGC CGCACTCTTC GAGGACGAAA GCCAGCGCCG CGGTTCTTCG CGTCTTCGCC 
TTGTCGACCG GCGTGAGAAG CTCCTGCTTT CGGTCGCGGC GCCAAGAAGC GCAGAAGCGG 

4510 4520 4530 4540 4550 4560 

GCGCCGGGTG ATGCTTCCCC CTTGGGGGCG GTGGAACATC GCCTCCGAGC TGCCAATCGG 
CGCGGCCCAC TACGAAGGGG GAACCCCCGC CACCTTGTAG CGGAGGCTCG ACGGTTAGCC 

4570 4580 4590 4600 4610 4620 

CACCTGCACG CATCGCTGGC GGCCGAAGTC AATTTACGGA CATAGTTTTA CAGTCTACCA 
GTGGACGTGC GTAGCGACCG CCGGCTTCAG TTAAATGCCT GTATCAAAAT GTCAGATGGT 

4630 4640 4650 4660 4670 4680 

AGCTACCATG CGTGGCGGGC TCACTTTGAG CGCACGCCGC GTCATTCCCG ATGCCCCCTG 
TCGATGGTAC GCACCGCCCG AGTGAAACTC GCGTGCGGCG CAGTAAGGGC TACGGGGGAC 

4690 4700 4710 4720 4730 4740 

AAGGTACTTC TCTTGATGCT TGGCCGCGGT CTCCTAGCCC TTTTCCTCCT GGCTTCGGCC 
TTCCATGAAG AGAACTACGA ACCGGCGCCA GAGGATCGGG AAAAGGAGGA CCGAAGCCGG 

4750 4760 4770 4780 4790 4800 

TGCCCGGC 
ACGGGCCG 
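
In the two-strand listings, the lower line of each row is the base-by-base complement of the upper line. The Python sketch below, added here as an illustration, checks one row of the listing (positions 3601 to 3620) for that property and also shows the reverse complement, which is how the lower strand reads in the 5' to 3' direction. The helper names `complement`, `reverse_complement`, and `mismatches` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Base-by-base complement, ignoring the spaces between 10-base groups."""
    return strand.replace(" ", "").translate(COMPLEMENT)


def reverse_complement(strand: str) -> str:
    """Complement read in the opposite direction."""
    return complement(strand)[::-1]


def mismatches(top: str, bottom: str) -> list:
    """1-based positions where the bottom line is not the complement of the top."""
    expected = complement(top)
    observed = bottom.replace(" ", "")
    if len(expected) != len(observed):
        raise ValueError("strands differ in length")
    return [i for i, (e, o) in enumerate(zip(expected, observed), start=1)
            if e != o]


# First two 10-base groups of the row starting at position 3601.
top = "AGCTTTGCGA CGAGGCCGAG"
bottom = "TCGAAACGCT GCTCCGGCTC"

print(mismatches(top, bottom))
print(reverse_complement("AGCTTTGCGA"))
```

An empty mismatch list means the two lines agree. A length error or a non-empty list points to a misread character in one of the strands.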
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10 20 30 40 50 60 

GAGCTCATAG AGCAGTTCCT CGATCGACTT CAGCAGTCGC ATGAAATCCA TGCCGTGCTC 
CTCGAGTATC TCGTCAAGGA GCTAGCTGAA GTCGTCAGCG TACTTTAGGT ACGGCACGAG 

70 80 90 100 110 120 

CCCTTGCTTC TATGCGTGGC ACGACCGCGC GCCGGGGCCG ATGCCGGTCA GTCGCGCAGA 
GGGAACGAAG ATACGCACCG TGCTGGCGCG CGGCCCCGGC TACGGCCAGT CAGCGCGTCT 

130 140 150 160 170 180 

CGCAGCTCGT CGGTACGCAT CTGCAGCATC TCCAGCGTCG ACAGGAAGCT CATGCCGAGC 
GCGTCGAGCA GCCATGCGTA GACGTCGTAG AGGTCGCAGC TGTCCTTCGA GTACGGCTCG 

190 200 210 220 230 240 

AGGCTCTGAT CGAGCTTGCC CTTGGCTGCG ACCGTTGCGC CGATGTTGCG GCGGGTGATC 
TCCGAGACTA GCTCGAACGG GAACCGACGC TGGCAACGCG GCTACAACGC CGCCCACTAG 

250 260 270 280 290 300 

GGGCCGATCG AGATCTCCTG AAGCATCACG GGGGCTGCCT GGGCCCGGCC ATTGGCTGTC 
CCCGGCTAGC TCTAGAGGAC TTCGTAGTGC CCCCGACGGA CCCGGGCCGG TAACCGACAG 

310 320 330 340 350 360 

ATGACCGTGA CGATAAAGTT GAGGTTGGCC GGGTCGAGGC CGATCTTTTC CGCATCTTCA 
TACTGGCACT GCTATTTCAA CTCCAACCGG CCCAGCTCCG GCTAGAAAAG GCGTAGAAGT 

370 380 390 400 410 420 

TAGGTGAGCG CGATGTTGCT GGCGCCGGTA TCGACCAGCA TGCTGATGTC CTTGCCGTCG 
ATCCACTCGC GCTACAACGA CCGCGGCCAT AGCTGGTCGT ACGACTACAG GAACGGCAGC 

430 440 450 460 470 480 

ACCGTCGCAG TGGTCTCGAA ATGACCGTTC AGCATCTTCT GCAGCACCAC TTCCTGCTGT 
TGGCAGCGTC ACCAGAGCTT TACTGGCAAG TCGTAGAAGA CGTCGTGGTG AAGGACGACA 

490 500 510 520 530 540 

CCCTCGCTGT CAGTGATGAT GGTGGCGCGG CCGGGGATGA GGCCGGCGAG CAGGCGGTTA
GGGAGCGACA GTCACTACTA CCACCGCGCC GGCCCCTACT CCGGCCGCTC GTCCGCCAAT 

550 560 570 580 590 600 

CCGAAGCCCT CCAACTCGAA GCGGTAGACA TAGGCCGAGA CCAGCGCCAG AACGACGAAG 
GGCTTCGGGA GGTTGAGCTT CGCCATCTGT ATCCGGCTCT GGTCGCGGTC TTGCTGCTTC 
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610 620 630 640 650 660 

AGCCAGATGG CGATCTGACG CAGGCCTTCG CCGAAGCGGT GGCGGCTCTG CAGGATGCCG 
TCGGTCTACC GCTAGACTGC GTCCGGAAGC GGCTTCGCCA CCGCCGAGAC GTCCTACGGC 

670 680 690 700 710 720 

GCGCCGATCA GCGTGGCGAT GGCGCCGAGC GAGACCAGTT GCCCGAACTG GTCATTGGCA 
CGCGGCTAGT CGCACCGCTA CCGCGGCTCG CTCTGGTCAA CGGGCTTGAC CAGTAACCGT 

730 740 750 760 770 780 

AGCCCCATGG TGCGGCCGGT GTCGTGGTTG ATGATCAGCA GGATGAGGCC GATGGCCAGG 
TCGGGGTACC ACGCCGGCCA CAGCACCAAC TACTAGTCGT CCTACTCCGG CTACCGGTCC 

790 800 810 820 830 840 

ATCGAGAGCA GGATGGCAAG ACGGGTCATG CTTCGCCGCG TTCCCTCGCC ATGCGCGTGC 
TAGCTCTCGT CCTACCGTTC TGCCCAGTAC GAAGCGGCGC AAGGGAGCGG TACGCGCACG 

850 860 870 880 890 900 

GTCGGGTTTC GCGCCGCGGC TTGCGTTCGA CGGTCTCAAG CCGTGCAGGC AACGCGCTCA 
CAGCCCAAAG CGCGGCGCCG AACGCAAGCT GCCAGAGTTC GGCACGTCCG TTGCGCGAGT 

910 920 930 940 950 960 

TGATCGCGCG GCGTTCGGCA TCGGTATAGA GCGTCCAGCG TCCGACTTCG TCGCGGGTAC 
ACTAGCGCGC CGCAAGCCGT AGCCATATCT CGCAGGTCGC AGGCTGAAGC AGCGCCCATG 

970 980 990 1000 1010 1020 

GGCCGCAGCC GAAACAGTAG CCGGTCTTGT CATCGATCGA ACAGACGAGA ATGCAGGGAG 
CCGGCGTCGG CTTTGTCATC GGCCAGAACA GTAGCTAGCT TGTCTGCTCT TACGTCCCTC 

1030 1040 1050 1060 1070 1080 

ATTCCATGGG CGTGCTCAGT TTTCCCTTGA TATATCGATG TTTCAAACCG TCAGCGCAAG 
TAAGGTACCC GCACGAGTCA AAAGGGAACT ATATAGCTAC AAAGTTTGGC AGTCGCGTTC 

1090 1100 1110 1120 1130 1140 

GGCACCGAGC ACGGCGATTT CGGTCAGTTG CTGCGTCGCC CCGATCGTGT CGCCCGTTTG 
CCGTGGCTCG TGCCGCTAAA GCCAGTCAAC GACGCAGCGG GGCTAGCACA GCGGGCAAAC 

1150 1160 1170 1180 1190 1200 

TCCGCCGATC TTGCGCATCG CCAGCCGAGC GAAGCCCTTG ACCGTGGCAA GGAATGCGAC 
AGGCGGCTAG AACGCGTAGC GGTCGGCTCG CTTCGGGAAC TGGCACCGTT CCTTACGCTG 
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1210 1220 1230 1240 1250 1260 

GAGCGCCGCG ATGACGCCGA GCGCCGGGAC CTGCGCGAGA TAGAAGAGCA GCATTGCGAC 
CTCGCGGCGC TACTGCGGCT CGCGGCCCTG GACGCGCTCT ATCTTCTCGT CGTAACGCTG 

1270 1280 1290 1300 1310 1320 

AAGAAGTCCG AAGGCAAGCG CGAAGCGCGT GGCCGCCGGT TCCGGCTCGC CAGCCGAGGC 
TTCTTCAGGC TTCCGTTCGC GCTTCGCGCA CCGGCGGCCA AGGCCGAGCG GTCGGCTCCG 

1330 1340 1350 1360 1370 1380 

CGCGACGCCG CTGCTGCGCG CCGGCGGAAG CGACGACCAG TGCCAGACCA TGGCGGCGCG 
GCGCTGCGGC GACGACGCGC GGCCGCCTTC GCTGCTGGTC ACGGTCTGGT ACCGCCGCGC 

1390 1400 1410 1420 1430 1440 

GCTGAGGCAC GCTGCGCCAA GGATCGCCAT GGCGGCGCCC AGCGGCGAAA AGAGCGGCAG 
CGACTCCGTG CGACGCGGTT CCTAGCGGTA CCGCCGCGGG TCGCCGCTTT TCTCGCCGTC 

1450 1460 1470 1480 1490 1500 

GATCGAGGCG AACGCCGAGA CGCGCAGGCC GAAGGAGAGG ATGAGGGCGA CGGCCGCATA 
CTAGCTCCGC TTGCGGCTCT GCGCGTCCGG CTTCCTCTCC TACTCCCGCT GCCGGCGTAT 

1510 1520 1530 1540 1550 1560 

GGTGCCGATG CGGCTGTCCT TCATGATCGC AAGCGCCGCT TCGCGGTCGC GACCGCCGCC 
CCACGGCTAC GCCGACAGGA AGTACTAGCG TTCGCGGCGA AGCGCCAGCG CTGGCGGCGG 

1570 1580 1590 1600 1610 1620 

AAAGCCATCG GCCGTGTCGC CAAGCCCGTC TTCGTGCAGT GCGCCGGTGA CAAGCGCCTG 
TTTCGGTAGC CGGCACAGCG GTTCGGGCAG AAGCACGTCA CGCGGCCACT GTTCGCGGAC 

1630 1640 1650 1660 1670 1680 

GATGGCGACG ACGACAAAGG CGGCAAAGAG CGAGCTCACC TGCAGCGCCA TGAGGGCCAT 
CTACCGCTGC TGCTGTTTCC GCCGTTTCTC GCTCGAGTGG ACGTCGCGGT ACTCCCGGTA 

1690 1700 1710 1720 1730 1740 

GGCGACGGCC GCCGATGGCA GTGCGATCGC CAGGCCGGCG AACGGGAAGG CGCGCACGGC 
CCGCTGCCGG CGGCTACCGT CACGCTAGCG GTCCGGCCGC TTGCCCTTCC GCGCGTGCCG 

1750 1760 1770 1780 1790 1800 

ACGGCTCAAG CGCCCGTCAT AACCTTCGAA ATGACGCGCA GGCATCGGGA TGCGGCTGAG 
TGCCGAGTTC GCGGGCAGTA TTGGAAGCTT TACTGCGCGT CCGTAGCCCT ACGCCGACTC 
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1810 1820 1830 1840 1850 1860 

AAAGCCGATC GACCGCGCCA CATCGTCACA GAAATCGCCA ACGAAGCCCA TGGCTCCTCC 
TTTCGGCTAG CTGGCGCGGT GTAGCAGTGT CTTTAGCGGT TGCTTCGGGT ACCGAGGAGG 

1870 1880 1890 1900 1910 1920 

AAGGTTGCGG CCATTGACCC GGCCGCTGCC AAACTCGCCG ACTGCGGCGA GTCTCGCAAG 
TTCCAACGCC GGTAACTGGG CCGGCGACGG TTTGAGCGGC TGACGCCGCT CAGAGCGTTC 

1930 1940 1950 1960 1970 1980 

CCGGGCGGGC GCACCCGCGA GGGCCGCGCA CACTTTTCCC AGACCTTTCA TAGGCCGTCT 
GGCCCGCCCG CGTGGGCGCT CCCGGCGCGT GTGAAAAGGG TCTGGAAAGT ATCCGGCAGA 

1990 2000 2010 2020 2030 2040 

GCGACCGCTC GCGGATCGAG ACGGCGACGC CGATTGGCGC AAATGTCGTT GCCCGAATTT 
CGCTGGCGAG CGCCTAGCTC TGCCGCTGCG GCTAACCGCG TTTACAGCAA CGGGCTTAAA 

2050 2060 2070 2080 2090 2100 

TCGGCGCCCT CTATGAGGGG CGTAGATAGA GCTTCACGAT GATGCAAGGA TTCCTCCCAT 
AGCCGCGGGA GATACTCCCC GCATCTATCT CGAAGTGCTA CTACGTTCCT AAGGAGGGTA 

2110 2120 2130 2140 2150 2160 

GAGTGCCAGC GGCCTGCCGT TTGATGATTT TCGCGAATTG TTGCGCAACC TGCCGGGCCC 
CTCACGGTCG CCGGACGGCA AACTACTAAA AGCGCTTAAC AACGCGTTGG ACGGCCCGGG 

2170 2180 2190 2200 2210 2220 

GGATGCGGCA GCCCTCGTTG CCGCGCGGGA GCGGGACGCC CAGCTGACGA AGCCGCCGGG 
CCTACGCCGT CGGGAGCAAC GGCGCGCCCT CGCCCTGCGG GTCGACTGCT TCGGCGGCCC 

2230 2240 2250 2260 2270 2280 

CGCGCTCGGC CGCCTCGAGG AAATCGCCTT CTGGCTCGCC GCCTGGACGG GCAAGGCGCC 
GCGCGAGCCG GCGGAGCTCC TTTAGCGGAA GACCGAGCGG CGGACCTGCC CGTTCCGCGG 

2290 2300 2310 2320 2330 2340 

GGTGGTCAAC CGGCCGCTGG TGGCGATCTT TGCCGGCAAC CACGGCGTCA CCCGCCAGGG 
CCACCAGTTG GCCGGCGACC ACCGCTAGAA ACGGCCGTTG GTGCCGCAGT GGGCGGTCCC 

2350 2360 2370 2380 2390 2400 

GGTGACCCCG TTCCCGTCAT CCGTCACCGC ACAGATGGTC GAGAATTTTG CCGCCGGTGG 
CCACTGGGGC AAGGGCAGTA GGCAGTGGCG TGTCTACCAG CTCTTAAAAC GGCGGCCACC 
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2410 2420 2430 2440 2450 2460 

CGCTGCGATC AACCAGATCT GCGTCAGCCA CGACCTCGGG CTGAAGGTCT TCGACCTCGC 
GCGACGCTAG TTGGTCTAGA CGCAGTCGGT GCTGGAGCCC GACTTCCAGA AGCTGGAGCG 

2470 2480 2490 2500 2510 2520 

ACTCGAATAC CCGACCGGTG ATATCACCGA GGAAGCCGCG CTGTCCGAGC GCGATTGCGC 
TGAGCTTATG GGCTGGCCAC TATAGTGGCT CCTTCGGCGC GACAGGCTCG CGCTAACGCG 

2530 2540 2550 2560 2570 2580 

CGCGACCATG GCCTTTGGCA TGGAGGCGAT TGCCGGCGGC ACGGATCTTC TGTGCATCGG 
GCGCTGGTAC CGGAAACCGT ACCTCCGCTA ACGGCCGCCG TGCCTAGAAG ACACGTAGCC 

2590 2600 2610 2620 2630 2640 

CGAAATGGGC ATCGGCAACA CCACGATCGC GGCCGCGATC AATCTCGGCC TTTATGGTGG 
GCTTTACCCG TAGCCGTTGT GGTGCTAGCG CCGGCGCTAG TTAGAGCCGG AAATACCACC 

2650 2660 2670 2680 2690 2700 

CACGGCCGAA GAATGGGTCG GTCCGGGTAC CGGCTCCGAG GGCGAGGTGC TGAAGCGCAA 
GTGCCGGCTT CTTACCCAGC CAGGCCCATG GCCGAGGCTC CCGCTCCACG ACTTCGCGTT 

2710 2720 2730 2740 2750 2760 

GATCGCCGCG GTCGAAAAGG CCGTGGCGCT GCATCGCGAT CACCTGTCCG ATCCGCTCGA 
CTAGCGGCGC CAGCTTTTCC GGCACCGCGA CGTAGCGCTA GTGGACAGGC TAGGCGAGCT 

2770 2780 2790 2800 2810 2820 

ACTGATGCGT CGCCTCGGCG GTCGTGAGAT CGCGGCCATG GCTGGCGCCA TCCTGGCCGC 
TGACTACGCA GCGGAGCCGC CAGCACTCTA GCGCCGGTAC CGACCGCGGT AGGACCGGCG 

2830 2840 2850 2860 2870 2880 

CCGCGTCCAG AAGGTACCTG TCATCATCGA CGGCTACGTG GCGACCGCTG CGGCTTCGAT 
GGCGCAGGTC TTCCATGGAC AGTAGTAGCT GCCGATGCAC CGCTGGCGAC GCCGAAGCTA 

2890 2900 2910 2920 2930 2940 

CCTGAAGGCG GCCAACCCGT CGGCCCTCGA CCATTGCCTG ATCGGCCATG TTTCGGGCGA 
GGACTTCCGC CGGTTGGGCA GCCGGGAGCT GGTAACGGAC TAGCCGGTAC AAAGCCCGCT 

2950 2960 2970 2980 2990 3000 

ACCGGGGCAT CTGCGCGCGA TCGAGAAGCT CGGCAAGACG CCGCTGCTGG CACTCGGCAT 
TGGCCCCGTA GACGCGCGCT AGCTCTTCGA GCCGTTCTGC GGCGACGACC GTGAGCCGTA 

FIG. 33E 



Francis BLANCHE et al. 
USAPN: 08/426,630 106 of 189 
Atty. Docket 3806.0050-01 



3010 3020 3030 3040 3050 3060 

GCGGCTTGGC GAAGGCACGG GCGCGGCCCT TGCCGCCGGT ATCGTCAAGG CGGCGGCCGC 
CGCCGAACCG CTTCCGTGCC CGCGCCGGGA ACGGCGGCCA TAGCAGTTCC GCCGCCGGCG 

3070 3080 3090 3100 3110 3120 

TTGCCACAGC GGCATGGCGA CCTTTGCCCA GGCCGGCGTC AGCAACAAGG AATAGTGAAG 
AACGGTGTCG CCGTACCGCT GGAAACGGGT CCGGCCGCAG TCGTTGTTCC TTATCACTTC 

3130 3140 3150 3160 3170 3180 

TTCCGGCCGG GCTTTGCAGG AAGGCCGGCC GGTTTCTGTC CAAGGCCTGT CACGGGCGCG 
AAGGCCGGCC CGAAACGTCC TTCCGGCCGG CCAAAGACAG GTTCCGGACA GTGCCCGCGC 

3190 3200 3210 3220 3230 3240 

AAGCTGTCGC GTGCCGGGCC TTGATGGATG CGTCCTTCTC GCCTATCCAA AGCGCAAATG 
TTCGACAGCG CACGGCCCGG AACTACCTAC GCAGGAAGAG CGGATAGGTT TCGCGTTTAC 

3250 3260 3270 3280 3290 3300 

CGCGCCCTAG CTATAGTCTT GGGTGCCTGC AACCGAGACC GCCTTGCATT CGCCTCAATC 
GCGCGGGATC GATATCAGAA CCCACGGACG TTGGCTCTGG CGGAACGTAA GCGGAGTTAG 

3310 3320 3330 3340 3350 3360 

ACGATGTCGA AGCAAGCACA GTTTCAAGCC CTGTCGAGAC GAAATGGACG CCAAGAACAC 
TGCTACAGCT TCGTTCGTGT CAAAGTTCGG GACAGCTCTG CTTTACCTGC GGTTCTTGTG 

3370 3380 3390 3400 3410 3420 

CACGCACCGC ATTGGACAGA CGGGTCCTGT CGAGAAGCAG ACCGGCATTC GGCATCTCTT 
GTGCGTGGCG TAACCTGTCT GCCCAGGACA GCTCTTCGTC TGGCCGTAAG CCGTAGAGAA 

3430 3440 3450 3460 3470 3480 

TGCCGCTGCG AGCTATTCGC TCGGCGGCGC CAAGCGGCTG [illegible] [illegible] 
ACGGCGACGC TCGATAAGCG AGCCGCCGCG GTTCGCCGAC [illegible] [illegible] 

3490 3500 3510 3520 3530 3540 

CCACGAGCTG ATCGCCTTTG CCGCCGCGAT GATCGCTTTC ATCATCGTCG GCGCAACCTT 
GGTGCTCGAC TAGCGGAAAC GGCGGCGCTA CTAGCGAAAG TAGTAGCAGC CGCGTTGGAA 

3550 3560 3570 3580 3590 3600 

CTTCCAATAT GTGGCGATGG CGATCCTGTT CCTGCTGATG ATGGCCTTCG AGGCGATCAA 
GAAGGTTATA CACCGCTACC GCTAGGACAA GGACGACTAC TACCGGAAGC TCCGCTAGTT 
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3610 3620 3630 3640 3650 3660 

CACGGCAATC GAGGAAATTG TCGATCGCGT TTCTCCCGAA ATCTCGGAAA TGGGTAAGAA 
GTGCCGTTAG CTCCTTTAAC AGCTAGCGCA AAGAGGGCTT TAGAGCCTTT ACCCATTCTT 

3670 3680 3690 3700 3710 3720 

CGCCAAGGAT CTCGGCTCCT TCGCCTGCCT CTGCCTGATT GTCGCCAACG GTGTCTATGC 
GCGGTTCCTA GAGCCGAGGA AGCGGACGGA GACGGACTAA CAGCGGTTGC CACAGATACG 

3730 3740 3750 3760 3770 3780 

CGCCTATGTC GTGATCTTCG ACGGCTTCAT GAACTGACCG GCTAGCGGGC CGGCGCCTTC 
GCGGATACAG CACTAGAAGC TGCCGAAGTA CTTGACTGGC CGATCGCCCG GCCGCGGAAG 

3790 3800 3810 3820 3830 3840 

ACCCGATAAA GCACATGCGG ACGCAGCGGG TTGCCCCCGG GTACCGTGAC GTCGTCGAAA 
TGGGCTATTT CGTGTACGCC TGCGTCGCCC AACGGGGGCC CATGGCACTG CAGCAGCTTT 

3850 3860 3870 3880 3890 3900 

TCATCAGCCG GATCC 
AGTAGTCGGC CTAGG 
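
In the double-stranded listings above, each pair of rows gives the upper strand and, directly beneath it, the complementary strand. A minimal Python sketch for checking that a transcribed pair of rows agrees base by base is given below. The function names are illustrative assumptions and are not part of the original figures.

```python
# Watson-Crick pairing used to compare the two printed rows.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement; spaces between 10-base groups are kept."""
    return strand.translate(COMPLEMENT)


def strands_agree(top: str, bottom: str) -> bool:
    """True when the lower row is the exact complement of the upper row."""
    return complement(top.strip()) == bottom.strip()


# Last row pair of the 3855-bp listing (positions 3841 to 3855).
top = "TCATCAGCCG GATCC"
bottom = "AGTAGTCGGC CTAGG"
print(strands_agree(top, bottom))
```

A row pair for which the check fails points to a transcription error in one of the two strands, although the check alone cannot say which strand is wrong.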
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SEQUENCE: 4749-BP FRAGMENT FROM 200 TO 800 LENGTH = 601 

[Open reading frame plot: FRAME 3, FRAME 2, FRAME 1; position axis 739 679 619 559 499 439 379 319 259] 

ATG (660) 
TAG (379) 

OPEN READING FRAME 14 

A= ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
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SEQUENCE: 4749-BP FRAGMENT FROM 800 TO 1500 LENGTH = 701 

[Open reading frame plot: FRAME 3, FRAME 2, FRAME 1 and CS FRAME 3, CS FRAME 2, CS FRAME 1; position axis 869 939 1009 1079 1149 1219 1289 1359 1429] 

GTG (925) 

A= ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 

OPEN READING FRAME 15 
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SEQUENCE: 4749-BP FRAGMENT FROM 1450 TO 2600 LENGTH = 1151 BP 

[Open reading frame plot: FRAME 3, FRAME 2, FRAME 1; position axis 1564 1679 1794 1909 2024 2139 2254 2369 2484] 

ATG (1512) 
TGA (2510) 

A= ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 

OPEN READING FRAME 16 
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SEQUENCE: 4749-BP FRAGMENT FROM 2500 TO 4650 LENGTH = 2151 

[Open reading frame plot: FRAME 3, FRAME 2, FRAME 1 and CS FRAME 3, CS FRAME 2, CS FRAME 1; position axis 2714 2929 3144 3359 3574 3789 4004 4219 4434] 

GTG (2616) 
TGA (4511) 

A= ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 

OPEN READING FRAME 17 
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SEQUENCE: 3855-BP FRAGMENT FROM 1 TO 905 LENGTH = 905 

[Open reading frame plot: FRAME 3, FRAME 2, FRAME 1 and CS FRAME 3, CS FRAME 2, CS FRAME 1; position axis 810 720 630 540 450 360 270 180 90] 

ATG (809) 
TGA (108) 

A= ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 

OPEN READING FRAME 18 
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1989 1874 1759 1644 1529 1414 1299 1184 1069 




*: STOP 

CS: COMPLEMENTARY STRAND 
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SEQUENCE: 3855-BP FRAGMENT FROM 2000 TO 3300 LENGTH = 1301 

[Open reading frame plot: FRAME 3, FRAME 2, FRAME 1; position axis 2129 2259 2389 2519 2649 2779 2909 3039 3169] 

ATG (2099) 

OPEN READING FRAME 20 

A= ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 
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SEQUENCE: 3855-BP FRAGMENT FROM 3250 TO 3855 LENGTH = 606 

[Open reading frame plot: FRAME 3, FRAME 2, FRAME 1 and CS FRAME 3, CS FRAME 2, CS FRAME 1; position axis 3309 3369 3429 3489 3549 3609 3669 3729 3789] 

ATG (3344) 
TGA (3757) 

A= ATG 
*: STOP 
CS: COMPLEMENTARY STRAND 

OPEN READING FRAME 21 
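
The open reading frame plots above mark ATG codons and stop codons in each of three reading frames. As a rough sketch of how such markers could be tabulated for one strand, the Python below scans a sequence and records, per frame, the positions of ATG and of the three standard stop codons. The frame numbering (frame 1 begins at the first base of the window), the `offset` parameter, and the function name are assumptions made for illustration; the plotting program used for the figures is not identified in the source.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_markers(seq: str, offset: int = 1) -> dict:
    """Map each frame (1, 2, 3) to the positions of its ATG and stop codons.

    Positions refer to the first base of the codon; `offset` is the
    coordinate assigned to the first base of `seq`.
    """
    markers = {frame: {"starts": [], "stops": []} for frame in (1, 2, 3)}
    for i in range(len(seq) - 2):
        codon = seq[i:i + 3]
        frame = i % 3 + 1
        if codon == "ATG":
            markers[frame]["starts"].append(i + offset)
        elif codon in STOP_CODONS:
            markers[frame]["stops"].append(i + offset)
    return markers


# Toy sequence, not taken from the figures.
demo = frame_markers("ATGAAATGA")
print(demo)
```

An open reading frame then corresponds to a stretch within one frame that runs from a start marker to the next stop marker of the same frame. The complementary strand can be handled by running the same scan on the reverse complement.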
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NAME = COBS 



FIRST RESIDUE = 1 
LAST RESIDUE = 332 
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RESIDUES = 332 
MOLECULAR WEIGHT (MONOISOTOPIC) = 36960.0000 
MOLECULAR WEIGHT (AVERAGE) = 36983.1797 
INDEX OF POLARITY (%) = 44.88 
ISOELECTRIC POINT = 6.34 

OD 260 (1 mg/ml) = 0.611 OD 280 (1 mg/ml) = 0.891 

HYDROPHILICITY 
COBS FROM 1 TO 332 
[Hydrophilicity plot: vertical axis lower bound -1.80; residue axis ticks 33 66 99 132 165 198 231 264 297 330] 

FIG. 40A 
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cobS GENE (SEQ ID NO: 31) AND COBS PROTEIN (SEQ ID NO: 32) 
SEQUENCE OF THE 4749-BP SalI-SalI-SalI-SalI-SalI-BglI FRAGMENT 

FROM 1512 TO 2510 

MetMetSerLysIleAspLeuAspIleSerAsnLeuProAspThrThrIleSerValArgGluValPheGlyIle 
ATGATGAGCAAGATTGACCTCGACATTTCCAACCTCCCCGACACCACGATTTCCGTCCGGGAGGTTTTCGGTATT 

1521 1531 1541 1551 1561 1571 1581 
AspThrAspLeuArgValProAlaTyrSerLysGlyAspAlaTyrValProAspLeuAspProAspTyrLeuPhe 
GATACGGATTTGCGCGTTCCTGCCTATTCGAAGGGCGACGCCTATGTCCCGGATCTGGATCCGGACTACCTCTTC 

1596 1606 1616 1626 1636 1646 1656 
AspArgGluThrThrLeuAlaIleLeuAlaGlyPheAlaHisAsnArgArgValMetValSerGlyTyrHisGly 
GACCGCGAAACGACGCTCGCCATTCTCGCAGGCTTCGCCCACAACCGACGCGTGATGGTGTCGGGCTATCACGGC 

1671 1681 1691 1701 1711 1721 1731 
ThrGlyLysSerThrHisIleGluGlnValAlaAlaArgLeuAsnTrpProCysValArgValAsnLeuAspSer 
ACCGGCAAGTCCACCCATATCGAGCAGGTCGCCGCGCGCCTCAACTGGCCGTGCGTGCGCGTCAACCTCGATAGC 

1746 1756 1766 1776 1786 1796 1806 
HisValSerArgIleAspLeuValGlyLysAspAlaIleValValLysAspGlyLeuGlnValThrGluPheLys 
CATGTCAGCCGTATCGACCTCGTCGGCAAGGACGCGATCGTCGTCAAGGACGGCCTGCAGGTCACCGAATTCAAG 

1821 1831 1841 1851 1861 1871 1881 
AspGlyIleLeuProTrpAlaTyrGlnHisAsnValAlaLeuValPheAspGluTyrAspAlaGlyArgProAsp 
GACGGCATCCTGCCCTGGGCCTACCAGCACAATGTCGCGCTCGTCTTCGACGAATACGATGCCGGCCGCCCGGAC 

1896 1906 1916 1926 1936 1946 1956 
ValMetPheValIleGlnArgValLeuGluSerSerGlyArgLeuThrLeuLeuAspGlnSerArgValIleArg 
GTCATGTTCGTCATCCAGCGCGTGCTGGAATCCTCCGGCCGCCTGACGCTGCTCGACCAGAGCCGTGTCATCCGT 

1971 1981 1991 2001 2011 2021 2031 
ProHisProAlaPheArgLeuPheAlaThrAlaAsnThrValGlyLeuGlyAspThrThrGlyLeuTyrHisGly 
CCGCACCCGGCCTTCCGCCTGTTTGCGACCGCCAACACCGTCGGCCTCGGCGACACGACCGGCCTCTATCACGGC 

2046 2056 2066 2076 2086 2096 2106 
ThrGlnGlnIleAsnGlnAlaGlnMetAspArgTrpSerIleValThrThrLeuAsnTyrLeuProHisAspLys 
ACGCAGCAGATCAACCAGGCGCAGATGGACCGCTGGTCGATCGTCACCACGCTGAACTACCTGCCGCACGACAAG 

2121 2131 2141 2151 2161 2171 2181 
GluValAspIleValAlaAlaLysValLysGlyPheThrAlaAspLysGlyArgGluThrValSerLysMetVal 
GAAGTCGACATCGTCGCCGCCAAGGTCAAGGGCTTCACCGCCGACAAGGGCCGCGAGACCGTCTCCAAGATGGTA 

2196 2206 2216 2226 2236 2246 2256 
ArgValAlaAspLeuThrArgAlaAlaPheIleAsnGlyAspLeuSerThrValMetSerProArgThrValIle 
CGTGTCGCCGACCTCACGCGCGCAGCCTTCATCAATGGCGATCTCTCGACTGTCATGAGCCCGCGTACGGTCATC 

2271 2281 2291 2301 2311 2321 2331 
ThrTrpAlaGluAsnAlaHisIlePheGlyAspIleAlaPheAlaPheArgValThrPheLeuAsnLysCysAsp 
ACCTGGGCCGAGAACGCCCACATCTTCGGCGACATCGCTTTCGCCTTCCGCGTGACCTTCCTCAACAAGTGCGAC 

2346 2356 2366 2376 2386 2396 2406 
GluLeuGluArgAlaLeuValAlaGluHisTyrGlnArgAlaPheGlyIleGluLeuLysGluCysAlaAlaAsn 
GAGCTGGAGCGGGCGCTGGTCGCCGAGCACTACCAGCGCGCCTTCGGCATCGAGCTGAAGGAATGCGCTGCCAAC 

2421 2431 2441 2451 2461 2471 2481 
IleValLeuGluAlaThrAla*** 
ATCGTGCTCGAAGCCACCGCCTGA 

2496 2506 
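
Each figure of this type prints the protein in three-letter code above the coding strand, one residue per codon. A minimal Python sketch for re-deriving the amino acid row from the nucleotide row, so that transcription errors in either row can be spotted, is given below. It assumes the standard genetic code; the table construction and the function name are illustrative and are not taken from the source.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter names; the stop codon is printed as "***".
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna: str) -> str:
    """Translate a coding strand, read from its first base, into three-letter code."""
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(THREE_LETTER[CODON_TABLE[codon]] for codon in codons)


# First nine codons and last eight codons of the cobS listing above.
print(translate("ATGATGAGCAAGATTGACCTCGACATT"))
print(translate("ATCGTGCTCGAAGCCACCGCCTGA"))
```

The coordinates agree with the residue count: positions 1512 to 2510 span 999 bases, that is 333 codons, or 332 residues plus the stop codon.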
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RESIDUES = 631 
MOLECULAR WEIGHT (MONOISOTOPIC) = 70291.3984 
MOLECULAR WEIGHT (AVERAGE) = 70334.7656 
INDEX OF POLARITY (%) = 50.87 
ISOELECTRIC POINT = 5.10 

OD 260 (1 mg/ml) = 0.423 OD 280 (1 mg/ml) = 0.610 

HYDROPHILICITY 
COBT FROM 1 TO 631 
[Hydrophilicity plot: vertical axis from -2.40 to 2.40; residue axis ticks 63 126 189 252 315 378 441 504 567 630] 

FIG. 40C 
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cobT GENE (SEQ ID NO: 33) AND COBT PROTEIN (SEQ ID NO: 34) 
SEQUENCE OF THE 4749-BP SalI-SalI-SalI-SalI-SalI-BglI FRAGMENT 

FROM 2616 TO 4511 

ValSerSerAsnSerLysAlaLysProThrThrArgGluAsnAlaAlaGluProPheLysArgAlaLeuSerGly 
GTGAGCTCGAATTCGAAGGCAAAGCCAACCACGCGCGAGAATGCTGCGGAACCGTTCAAGCGGGCGCTTTCCGGC 

2625 2635 2645 2655 2665 2675 2685 
CysIleArgSerIleAlaGlyAspAlaGluValGluValAlaPheAlaAsnGluArgProGlyMetThrGlyGlu 
TGCATCCGATCGATCGCGGGCGATGCCGAGGTGGAAGTCGCCTTCGCCAACGAGCGGCCGGGCATGACCGGCGAA 

2700 2710 2720 2730 2740 2750 2760 
ArgIleArgLeuProGluLeuSerLysArgProThrLeuGlnGluLeuAlaValThrArgGlyLeuGlyAspSer 
CGCATCCGTCTGCCGGAACTTTCCAAGCGCCCGACCCTGCAGGAACTTGCCGTGACCCGCGGGCTCGGTGACAGC 

2775 2785 2795 2805 2815 2825 2835 
MetAlaLeuArgLysAlaCysThrHisAlaArgIleGlnArgThrMetSerProGlnGlyAlaAspAlaArgAla 
ATGGCGCTGCGCAAGGCCTGTACGCATGCGCGGATCCAGCGCACCATGTCGCCGCAAGGGGCGGACGCCCGCGCG 

2850 2860 2870 2880 2890 2900 2910 
IlePheAspAlaValGluGlnAlaArgValGluAlaIleGlySerLeuArgMetAlaGlyValAlaLysAsnLeu 
ATCTTCGATGCGGTGGAGCAGGCTCGTGTCGAGGCGATCGGGTCGTTGCGCATGGCGGGTGTCGCCAAGAACCTC 

2925 2935 2945 2955 2965 2975 2985 
AsnValMetLeuGluGluLysTyrAlaLysAlaAsnPheAlaThrIleGluArgGlnAlaAspAlaProLeuGly 
AACGTCATGCTCGAAGAGAAATACGCCAAGGCGAATTTCGCAACGATCGAGCGCCAGGCGGACGCGCCGCTCGGC 

3000 3010 3020 3030 3040 3050 3060 
GluAlaValAlaLeuLeuValArgGluLysLeuThrGlyGlnLysProProAlaSerAlaGlyLysValLeuAsp 
GAGGCCGTAGCGCTGCTGGTGCGCGAGAAGCTGACGGGCCAGAAGCCGCCGGCGTCTGCCGGCAAGGTGCTCGAC 

3075 3085 3095 3105 3115 3125 3135 
LeuTrpArgGluPheIleGluGlyLysAlaAlaGlyAspIleGluHisLeuSerSerThrIleAsnAsnGlnGln 
CTCTGGCGCGAGTTCATCGAGGGCAAGGCTGCCGGCGACATTGAGCACCTGTCGTCGACGATCAACAACCAGCAG 

3150 3160 3170 3180 3190 3200 3210 
AlaPheAlaArgValValArgAspMetLeuThrSerMetGluValAlaGluLysTyrGlyAspAspAspAsnGlu 
GCCTTTGCCCGGGTCGTTCGCGACATGCTGACCTCGATGGAAGTCGCCGAGAAATACGGTGACGACGACAACGAG 

3225 3235 3245 3255 3265 3275 3285 
ProAspGluGlnGluSerGluThrAspGluAspGlnProArgSerGlnGluGlnAspGluAsnAlaSerAspGlu 
CCGGACGAGCAGGAAAGCGAGACCGACGAAGACCAGCCGCGCAGCCAGGAGCAGGACGAGAACGCCAGCGACGAG 

3300 3310 3320 3330 3340 3350 3360 
GluAlaGlyAspAspAlaAlaProAlaAspGluAsnGlnAlaAlaGluGluGlnMetGluGluGlyGluMetAsp 
GAAGCCGGCGACGATGCCGCACCCGCCGACGAGAACCAGGCTGCCGAAGAGCAGATGGAAGAAGGCGAGATGGAC 

3375 3385 3395 3405 3415 3425 3435 
GlyAlaGluIleSerAspAspAspLeuGlnAspGluGlyAspGluAspSerGluThrProGlyGluValLysArg 
GGCGCGGAGATCTCCGACGACGATCTCCAGGACGAAGGCGACGAGGACAGCGAAACGCCCGGCGAGGTCAAGCGT 

3450 3460 3470 3480 3490 3500 3510 
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ProAsnGlnProPheAlaAspPheAsnGluLysValAspTyrAlaValPheThrArgGluPheAspGluThrIle 
CCGAACCAGCCCTTCGCCGACTTCAACGAGAAGGTCGACTACGCCGTCTTCACCCGCGAGTTCGACGAGACGATT 

3525 3535 3545 3555 3565 3575 3585 
AlaSerGluGluLeuCysAspGluAlaGluLeuAspArgLeuArgAlaPheLeuAspLysGlnLeuAlaHisLeu 
GCCTCGGAAGAGCTTTGCGACGAGGCCGAGCTCGACCGGCTGCGCGCCTTCCTCGACAAGCAGCTTGCCCATCTT 

3600 3610 3620 3630 3640 3650 3660 
GlnGlyAlaValGlyArgLeuAlaAsnArgLeuGlnArgArgLeuMetAlaGlnGlnAsnArgSerTrpGluPhe 
CAAGGCGCGGTCGGCCGCCTTGCCAACCGGCTGCAGCGCCGCCTGATGGCGCAGCAGAACCGCTCCTGGGAGTTC 

3675 3685 3695 3705 3715 3725 3735 
AspLeuGluGluGlyTyrLeuAspSerAlaArgLeuGlnArgIleIleIleAspProMetGlnProLeuSerPhe 
GATCTCGAAGAGGGGTATCTCGATTCGGCGCGGCTTCAGCGCATCATCATCGATCCGATGCAGCCGCTTTCCTTC 

3750 3760 3770 3780 3790 3800 3810 
LysArgGluLysAspThrAsnPheArgAspThrValValThrLeuLeuIleAspAsnSerGlySerMetArgGly 
AAGCGCGAAAAGGACACCAACTTCCGCGATACCGTCGTGACGCTGCTGATCGACAATTCCGGCTCGATGCGCGGC 

3825 3835 3845 3855 3865 3875 3885 
ArgProIleThrValAlaAlaThrCysAlaAspIleLeuAlaArgThrLeuGluArgCysGlyValLysValGlu 
CGTCCGATCACGGTTGCCGCCACCTGCGCCGATATCCTGGCGCGCACGCTCGAGCGCTGCGGCGTCAAGGTCGAG 

3900 3910 3920 3930 3940 3950 3960 
IleLeuGlyPheThrThrLysAlaTrpLysGlyGlyGlnSerArgGluLysTrpLeuAlaGlyGlyLysProGln 
ATCCTCGGTTTTACCACCAAGGCGTGGAAGGGTGGGCAGTCACGCGAGAAGTGGCTGGCCGGCGGCAAGCCACAG 

3975 3985 3995 4005 4015 4025 4035 
AlaProGlyArgLeuAsnAspLeuArgHisIleValTyrLysSerAlaAspAlaProTrpArgArgAlaArgArg 
GCCCCGGGTCGCCTCAACGACCTGCGACACATCGTCTACAAGTCTGCCGACGCTCCGTGGCGCCGGGCACGACGC 

4050 4060 4070 4080 4090 4100 4110 
AsnLeuGlyLeuMetMetArgGluGlyLeuLeuLysGluAsnIleAspGlyGluAlaLeuIleTrpAlaHisGlu 
AATCTCGGCCTGATGATGCGGGAAGGCCTGCTCAAGGAAAACATCGACGGCGAGGCGTTGATTTGGGCGCATGAG 

4125 4135 4145 4155 4165 4175 4185 
ArgLeuMetAlaArgArgGluGlnArgArgIleLeuMetMetIleSerAspGlyAlaProValAspAspSerThr 
CGGCTGATGGCGCGGCGCGAACAGCGGCGCATCCTGATGATGATTTCGGACGGCGCGCCGGTCGACGACTCGACG 

4200 4210 4220 4230 4240 4250 4260 
LeuSerValAsnProGlyAsnTyrLeuGluArgHisLeuArgAlaValIleGluGlnIleGluThrArgSerPro 
CTGTCGGTCAATCCAGGAAACTATCTGGAGCGTCACCTGCGCGCGGTCATCGAGCAGATCGAAACGCGCTCGCCG 

4275 4285 4295 4305 4315 4325 4335 
ValGluLeuLeuAlaIleGlyIleGlyHisAspValThrArgTyrTyrArgArgAlaValThrIleValAspAla 
GTGGAACTGCTGGCGATCGGTATCGGCCACGACGTGACGCGCTACTATCGCCGTGCCGTCACCATCGTCGATGCC 

4350 4360 4370 4380 4390 4400 4410 
AspGluLeuAlaGlyAlaMetThrGluGlnLeuAlaAlaLeuPheGluAspGluSerGlnArgArgGlySerSer 
GATGAGCTTGCCGGCGCGATGACCGAACAGCTGGCCGCACTCTTCGAGGACGAAAGCCAGCGCCGCGGTTCTTCG 

4425 4435 4445 4455 4465 4475 4485 
ArgLeuArgArgAlaGly*** 
CGTCTTCGCCGCGCCGGGTGA 

FIG. 40E 
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RESIDUES = 93 
MOLECULAR WEIGHT (MONOISOTOPIC) = 10279.2354 
MOLECULAR WEIGHT (AVERAGE) = 10285.6309 
INDEX OF POLARITY (%) = 48.39 
ISOELECTRIC POINT = 6.94 

OD 260 (1 mg/ml) = 0.411 OD 280 (1 mg/ml) = 0.541 

HYDROPHILICITY 
COBX FROM 1 TO 93 
[Hydrophilicity plot: residue axis ticks 9 18 27 36 45 54 63 72 81 90] 

FIG. 40F 
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cobX GENE (SEQ ID NO: 35) AND COBX PROTEIN (SEQ ID NO: 36) 
SEQUENCE OF THE 4749-BP MI "Ml "Sal I "Sal I -Sal I "Sal I FRAGMENT 

FROM 4089 TO 4370 



MetSerLeuThrGluThrIleGluLysLysLeuIleGluAlaPheHisProGluArgLeuGluValIleAsnGlu 
ATGTCGCTCACCGAGACCATCGAAAAGAAGCTGATCGAGGCCTTCCACCCTGAACGGCTCGAGGTCATCAACGAG 

4098 4108 4118 4128 4138 4148 4158 
SerHisGlnHisThrGlyHisGlnProGlyPheAspGlyThrGlyGluSerHisMetArgValArgIleValSer 
AGCCATCAGCATACCGGCCATCAGCCGGGCTTCGATGGTACCGGCGAGTCCCACATGCGGGTGCGTATCGTTTCT 

4173 4183 4193 4203 4213 4223 4233 
SerAlaPheAlaGlyMetSerArgValAlaArgHisArgAlaIleAsnAspLeuLeuLysProGluLeuAspAla 
AGCGCCTTTGCCGGCATGAGCCGTGTCGCCCGCCACCGCGCCATCAATGATCTCCTGAAGCCAGAACTCGACGCC 

4248 4258 4268 4278 4288 4298 4308 
GlyLeuHisAlaLeuAlaValGluProAlaAlaProGlyGluProThrArgTrp*** 
GGCCTGCATGCGCTCGCCGTCGAGCCGGCAGCCCCCGGCGAGCCGACCCGCTGGTAG 

4323 4333 4343 4353 4363 



FIG. 40G 
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RESIDUES = 338 
MOLECULAR WEIGHT (MONOISOTOPIC) = 34659.9844 
MOLECULAR WEIGHT (AVERAGE) = 34681.9609 
INDEX OF POLARITY (%) = 34.32 
ISOELECTRIC POINT = 6.21 

OD 260 (1 mg/ml) = 0.416 OD 280 (1 mg/ml) = 0.584 

HYDROPHILICITY 
COBU FROM 1 TO 338 
[Hydrophilicity plot: vertical axis from -2.10 to 2.10; residue axis ticks 33 66 99 132 165 198 231 264 297 330] 

FIG. 41A 
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cobU GENE (SEQ ID NO: 37) AND COBU PROTEIN (SEQ ID NO: 38) 
SEQUENCE OF THE 3855-BP SstI-SstI-BamHI FRAGMENT 
FROM 2099 TO 3115 

MetSerAlaSerGlyLeuProPheAspAspPheArgGluLeuLeuArgAsnLeuProGlyProAspAlaAlaAla 
ATGAGTGCCAGCGGCCTGCCGTTTGATGATTTTCGCGAATTGTTGCGCAACCTGCCGGGCCCGGATGCGGCAGCC 

2108 2118 2128 2138 2148 2158 2168 
LeuValAlaAlaArgGluArgAspAlaGlnLeuThrLysProProGlyAlaLeuGlyArgLeuGluGluIleAla 
CTCGTTGCCGCGCGGGAGCGGGACGCCCAGCTGACGAAGCCGCCGGGCGCGCTCGGCCGCCTCGAGGAAATCGCC 

2183 2193 2203 2213 2223 2233 2243 
PheTrpLeuAlaAlaTrpThrGlyLysAlaProValValAsnArgProLeuValAlaIlePheAlaGlyAsnHis 
TTCTGGCTCGCCGCCTGGACGGGCAAGGCGCCGGTGGTCAACCGGCCGCTGGTGGCGATCTTTGCCGGCAACCAC 

2258 2268 2278 2288 2298 2308 2318 
GlyValThrArgGlnGlyValThrProPheProSerSerValThrAlaGlnMetValGluAsnPheAlaAlaGly 
GGCGTCACCCGCCAGGGGGTGACCCCGTTCCCGTCATCCGTCACCGCACAGATGGTCGAGAATTTTGCCGCCGGT 

2333 2343 2353 2363 2373 2383 2393 
GlyAlaAlaIleAsnGlnIleCysValSerHisAspLeuGlyLeuLysValPheAspLeuAlaLeuGluTyrPro 
GGCGCTGCGATCAACCAGATCTGCGTCAGCCACGACCTCGGGCTGAAGGTCTTCGACCTCGCACTCGAATACCCG 

2408 2418 2428 2438 2448 2458 2468 
ThrGlyAspIleThrGluGluAlaAlaLeuSerGluArgAspCysAlaAlaThrMetAlaPheGlyMetGluAla 
ACCGGTGATATCACCGAGGAAGCCGCGCTGTCCGAGCGCGATTGCGCCGCGACCATGGCCTTTGGCATGGAGGCG 

2483 2493 2503 2513 2523 2533 2543 
IleAlaGlyGlyThrAspLeuLeuCysIleGlyGluMetGlyIleGlyAsnThrThrIleAlaAlaAlaIleAsn 
ATTGCCGGCGGCACGGATCTTCTGTGCATCGGCGAAATGGGCATCGGCAACACCACGATCGCGGCCGCGATCAAT 

2558 2568 2578 2588 2598 2608 2618 
LeuGlyLeuTyrGlyGlyThrAlaGluGluTrpValGlyProGlyThrGlySerGluGlyGluValLeuLysArg 
CTCGGCCTTTATGGTGGCACGGCCGAAGAATGGGTCGGTCCGGGTACCGGCTCCGAGGGCGAGGTGCTGAAGCGC 

2633 2643 2653 2663 2673 2683 2693 
LysIleAlaAlaValGluLysAlaValAlaLeuHisArgAspHisLeuSerAspProLeuGluLeuMetArgArg 
AAGATCGCCGCGGTCGAAAAGGCCGTGGCGCTGCATCGCGATCACCTGTCCGATCCGCTCGAACTGATGCGTCGC 

2708 2718 2728 2738 2748 2758 2768 
LeuGlyGlyArgGluIleAlaAlaMetAlaGlyAlaIleLeuAlaAlaArgValGlnLysValProValIleIle 
CTCGGCGGTCGTGAGATCGCGGCCATGGCTGGCGCCATCCTGGCCGCCCGCGTCCAGAAGGTACCTGTCATCATC 

2783 2793 2803 2813 2823 2833 2843 
AspGlyTyrValAlaThrAlaAlaAlaSerIleLeuLysAlaAlaAsnProSerAlaLeuAspHisCysLeuIle 
GACGGCTACGTGGCGACCGCTGCGGCTTCGATCCTGAAGGCGGCCAACCCGTCGGCCCTCGACCATTGCCTGATC 

2858 2868 2878 2888 2898 2908 2918 
GlyHisValSerGlyGluProGlyHisLeuArgAlaIleGluLysLeuGlyLysThrProLeuLeuAlaLeuGly 
GGCCATGTTTCGGGCGAACCGGGGCATCTGCGCGCGATCGAGAAGCTCGGCAAGACGCCGCTGCTGGCACTCGGC 

2933 2943 2953 2963 2973 2983 2993 
MetArgLeuGlyGluGlyThrGlyAlaAlaLeuAlaAlaGlyIleValLysAlaAlaAlaAlaCysHisSerGly 
ATGCGGCTTGGCGAAGGCACGGGCGCGGCCCTTGCCGCCGGTATCGTCAAGGCGGCGGCCGCTTGCCACAGCGGC 

3008 3018 3028 3038 3048 3058 3068 
MetAlaThrPheAlaGlnAlaGlyValSerAsnLysGlu*** 
ATGGCGACCTTTGCCCAGGCCGGCGTCAGCAACAAGGAATAG 

3083 3093 3103 3113 

FIG. 41B 
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NAME = COBV 

FIRST RESIDUE = 1 
LAST RESIDUE = 302 

NO.  RESIDUE  NUMBER      %    WEIGHT  % WEIGHT 
  1  PHE  F       18   5.96   2647.23      8.64 
  2  LEU  L       39  12.91   4410.28     14.39 
  3  ILE  I       13   4.30   1470.09      4.80 
  4  MET  M       10   3.31   1310.41      4.28 
  5  VAL  V       23   7.62   2278.57      7.44 
  6  SER  S       18   5.96   1566.58      5.11 
  7  PRO  P       12   3.97   1164.63      3.80 
  8  THR  T       10   3.31   1010.48      3.30 
  9  ALA  A       63  20.86   4475.34     14.61 
 10  TYR  Y        3   0.99    489.19      1.60 
 11                0   0.00      0.00      0.00 
 12  HIS  H        3   0.99    411.18      1.34 
 13  GLN  Q        6   1.99    768.35      2.51 
 14  ASN  N        2   0.66    228.09      0.74 
 15  LYS  K        5   1.66    640.47      2.09 
 16  ASP  D       10   3.31   1150.27      3.75 
 17  GLU  E        7   2.32    903.30      2.95 
 18  CYS  C        3   0.99    309.03      1.01 
 19  TRP  W        2   0.66    372.16      1.21 
 20  ARG  R       19   6.29   2965.92      9.68 
 21  GLY  G       36  11.92   2052.77      6.70 
 22                0   0.00      0.00      0.00 

RESIDUES = 302 
MOLECULAR WEIGHT (MONOISOTOPIC) = 30642.3359 
MOLECULAR WEIGHT (AVERAGE) = 30662.0820 
INDEX OF POLARITY (%) = 26.49 
ISOELECTRIC POINT = 9.97 

OD 260 (1 mg/ml) = 0.391 OD 280 (1 mg/ml) = 0.479 

HYDROPHILICITY 
COBV FROM 1 TO 302 
[Hydrophilicity plot: vertical axis lower bound -1.80; residue axis ticks 30 60 90 120 150 180 210 240 270 300] 
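
The percentage column of the COBV composition table follows directly from the residue counts. A minimal Python sketch, assuming each percentage is the count divided by the total number of residues and rounded to two decimals, is given below. The dictionary and function names are illustrative; the counts are those listed for COBV.

```python
# Residue counts for COBV as listed in the composition table.
COBV_COUNTS = {
    "Phe": 18, "Leu": 39, "Ile": 13, "Met": 10, "Val": 23, "Ser": 18,
    "Pro": 12, "Thr": 10, "Ala": 63, "Tyr": 3, "His": 3, "Gln": 6,
    "Asn": 2, "Lys": 5, "Asp": 10, "Glu": 7, "Cys": 3, "Trp": 2,
    "Arg": 19, "Gly": 36,
}


def composition_percent(counts: dict) -> dict:
    """Return each residue's share of the total count, in percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


percent = composition_percent(COBV_COUNTS)
print(sum(COBV_COUNTS.values()))
print(percent["Ala"], percent["Leu"], percent["Gly"])
```

The counts sum to 302, matching the stated number of residues, which is a quick consistency check on a transcribed table.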
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cobV GENE (SEQ ID NO: 39) AND COBV PROTEIN (SEQ ID NO: 40) 
SEQUENCE OF THE 3855-BP BamHI-SstI-SstI FRAGMENT 
FROM 1885 TO 2793 

MetLysGlyLeuGlyLysValCysAlaAlaLeuAlaGlyAlaProAlaArgLeuAlaArgLeuAlaAlaValGly 
ATGAAAGGTCTGGGAAAAGTGTGCGCGGCCCTCGCGGGTGCGCCCGCCCGGCTTGCGAGACTCGCCGCAGTCGGC 

1894 1904 1914 1924 1934 1944 1954 
GluPheGlySerGlyArgValAsnGlyArgAsnLeuGlyGlyAlaMetGlyPheValGlyAspPheCysAspAsp 
GAGTTTGGCAGCGGCCGGGTCAATGGCCGCAACCTTGGAGGAGCCATGGGCTTCGTTGGCGATTTCTGTGACGAT 

1969 1979 1989 1999 2009 2019 2029 
ValAlaArgSerIleGlyPheLeuSerArgIleProMetProAlaArgHisPheGluGlyTyrAspGlyArgLeu 
GTGGCGCGGTCGATCGGCTTTCTCAGCCGCATCCCGATGCCTGCGCGTCATTTCGAAGGTTATGACGGGCGCTTG 

2044 2054 2064 2074 2084 2094 2104 
SerArgAlaValArgAlaPheProPheAlaGlyLeuAlaIleAlaLeuProSerAlaAlaValAlaMetAlaLeu 
AGCCGTGCCGTGCGCGCCTTCCCGTTCGCCGGCCTGGCGATCGCACTGCCATCGGCGGCCGTCGCCATGGCCCTC 

2119 2129 2139 2149 2159 2169 2179 
MetAlaLeuGlnValSerSerLeuPheAlaAlaPheValValValAlaIleGlnAlaLeuValThrGlyAlaLeu 
ATGGCGCTGCAGGTGAGCTCGCTCTTTGCCGCCTTTGTCGTCGTCGCCATCCAGGCGCTTGTCACCGGCGCACTG 

2194 2204 2214 2224 2234 2244 2254 
HisGluAspGlyLeuGlyAspThrAlaAspGlyPheGlyGlyGlyArgAspArgGluAlaAlaLeuAlaIleMet 
CACGAAGACGGGCTTGGCGACACGGCCGATGGCTTTGGCGGCGGTCGCGACCGCGAAGCGGCGCTTGCGATCATG 

2269 2279 2289 2299 2309 2319 2329 
LysAspSerArgIleGlyThrTyrAlaAlaValAlaLeuIleLeuSerPheGlyLeuArgValSerAlaPheAla 
AAGGACAGCCGCATCGGCACCTATGCGGCCGTCGCCCTCATCCTCTCCTTCGGCCTGCGCGTCTCGGCGTTCGCC 

2344 2354 2364 2374 2384 2394 2404 
SerIleLeuProLeuPheSerProLeuGlyAlaAlaMetAlaIleLeuGlyAlaAlaCysLeuSerArgAlaAla 
TCGATCCTGCCGCTCTTTTCGCCGCTGGGCGCCGCCATGGCGATCCTTGGCGCAGCGTGCCTCAGCCGCGCCGCC 

2419 2429 2439 2449 2459 2469 2479 
MetValTrpHisTrpSerSerLeuProProAlaArgSerSerGlyValAlaAlaSerAlaGlyGluProGluPro 
ATGGTCTGGCACTGGTCGTCGCTTCCGCCGGCGCGCAGCAGCGGCGTCGCGGCCTCGGCTGGCGAGCCGGAACCG 

2494 2504 2514 2524 2534 2544 2554 
AlaAlaThrArgPheAlaLeuAlaPheGlyLeuLeuValAlaMetLeuLeuPheTyrLeuAlaGlnValProAla 
GCGGCCACGCGCTTCGCGCTTGCCTTCGGACTTCTTGTCGCAATGCTGCTCTTCTATCTCGCGCAGGTCCCGGCG 

2569 2579 2589 2599 2609 2619 2629 
LeuGlyValIleAlaAlaLeuValAlaPheLeuAlaThrValLysGlyPheAlaArgLeuAlaMetArgLysIle 
CTCGGCGTCATCGCGGCGCTCGTCGCATTCCTTGCCACGGTCAAGGGCTTCGCTCGGCTGGCGATGCGCAAGATC 

2644 2654 2664 2674 2684 2694 2704 
GlyGlyGlnThrGlyAspThrIleGlyAlaThrGlnGlnLeuThrGluIleAlaValLeuGlyAlaLeuAlaLeu 
GGCGGACAAACGGGCGACACGATCGGGGCGACGCAGCAACTGACCGAAATCGCCGTGCTCGGTGCCCTTGCGCTG 

2719 2729 2739 2749 2759 2769 2779 
ThrVal*** 
ACGGTTTGA 
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SEQUENCE 


LENGTH 


= 13144 FRO 


1 n 


on 




GAGCTCGAAG 


GGGCTTCCGC 


CCCGATCGCT 


CTCGAGCTTC 


CCCGAAGGCG 


GGGCTAGCGA 


/u 


on 


on 


CGCCGAGCGG 


GCCGAAGGGC 


GCGTCGACGA 


GCGGCTCGCC 


CGGCTTCCCG 


CGCAGCTGCT 


i jU 


i4U 


iDU 


GAACCTTCGA 


GTTCCAGGCG 


ATCTGAACGA 


CTTGGAAGCT 


CAAGGTCCGC 


TAGACTTGCT 




ZUU 


Tin 
zlO 


ACATGAACCT 


TGAGAGGCCG 


GAGGCCTATC 


TGTACTTGGA 


ACTCTCCGGC 


CTCCGGATAG 


Z jU 


ZdU 


OTn 
Z /U 


AGGTGTGCGC 


TGCAAAAAAT 


TGAATGCCAA 


TCCACACGCG 


ACGTTTTTTA 


ACTTACGGTT 






330 


CGGCCGCGAC 


ATTTTCGACA 


AGCCTTGCGA 


GCCGGCGCTG 


TAAAAGCTGT 


TCGGAACGCT 


0 /U 


joO 


0 nn 
390 


TCAATTGCGG 


CGAAATCGTG 


TCGAAACAGA 


AGTTAACGCC 


GCTTTAGCAC 


AGCTTTGTCT 


/ion 


440 


450 


ATGGCCGCAT 


GACACGCAGG 


ATCATGTTGC 


TACCGGCGTA 


CTGTGCGTCC 


TAGTACAACG 


490 


500 


510 


TATTGGTGGC 


GGGGCTCTGC 


CGGCTTGCCG 


ATAACCACCG 


CCCCGAGACG 


GCCGAAGGGC 


550 


560 


570 


AGCCGCAGAA 


CATGTCGAAC 


AACGCCGCCG 


TCGGCGTCTT 


GTACAGCTTG 


TTGCGGCGGC 



F/G. 



1 TO 13144 



40 50 60

GGCGTTAGCC GACGTTCGAC GTGCGGATGA
CCGCAATCGG CTGCAAGCTG CACGCCTACT

100 110 120

CGAGGTTGCG TACGCGCGAC TGGCTGGACG
GCTCCAACGC ATGCGCGCTG ACCGACCTGC

160 170 180

AATTGGGCTT GCTGAAAATA TACAGCATGG
TTAACCCGAA CGACTTTTAT ATGTCGTACC

220 230 240

CTCCGGGGCG TGTTGCTATG CCGCTGATAT
GAGGCCCCGC ACAACGATAC GGCGACTATA

280 290 300

ACTCGCCACG CCATGTCGCA TTCTGGCTAT
TGAGCGGTGC GGTACAGCGT AAGACCGATA

340 350 360

AAGCGCGAAA CAATGCGTGA AAGGGCTTTG
TTCGCGCTTT GTTACGCACT TTCCCGAAAC

400 410 420

CCTTTGCCGC TGCCCGTTTC AGTGTTACCG
GGAAACGGCG ACGGGCAAAG TCACAATGGC


460 470 480


AGGGAAPfGR 

nvj VJ vjnn V v-' vj vj 


rTfGGATfiTr 

X wwri 1 o X 


nnAAAATrnr; 






K^K^l X i ifib^V^ 


520 530 540

CCAATCAGGG CCTGAAGGTC CGGCCGTTCA
GGTTAGTCCC GGACTTCCAG GCCGGCAAGT

580 590 600

TTTCCGACGA CGGCGGCGAC ATCGGCCGCG
AAAGGCTGCT GCCGCCGCTC TAGCCGGCGC
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610 620 630 640 650 660

CGCAATGGCT GCAGGCGCTG GCCGCGCGCG TGCCGTCGTC GGTGCACATG AACCCGGTGC
GCGTTACCGA CGTCCGCGAC CGGCGCGCGC ACGGCAGCAG CCACGTGTAC TTGGGCCACG

670 680 690 700 710 720

TCCTGAAGCC GCAGTCGGAC GTGGGCAGCC AGATCGTCGT TCAGGGCAAG GTCGCCGGGC
AGGACTTCGG CGTCAGCCTG CACCCGTCGG TCTAGCAGCA AGTCCCGTTC CAGCGGCCCG

730 740 750 760 770 780

AGGCCAGGGG GCGGGAATAT CAGGCGCTCA AGCCCAAGCT GCTGGGCGCC GTCATGGAGA
TCCGGTCCCC CGCCCTTATA GTCCGCGAGT TCGGGTTCGA CGACCCGCGG CAGTACCTCT

790 800 810 820 830 840

GTTTCGAACA AATATCGGCC GGTGCCGATC TCGTGGTGGT CGAAGGCGCC GGCTCGCCGG
CAAAGCTTGT TTATAGCCGG CCACGGCTAG AGCACCACCA GCTTCCGCGG CCGAGCGGCC

850 860 870 880 890 900

CCGAAATCAA CCTCAGGCCC GGCGACATCG CCAATATGGG CTTTGCGACA CGGGCCAATG
GGCTTTAGTT GGAGTCCGGG CCGCTGTAGC GGTTATACCC GAAACGCTGT GCCCGGTTAC

910 920 930 940 950 960

TGCCGGTCGT GCTGGTCGGC GACATCGACC GCGGGGGGGT GATCGCCTCG CTGGTCGGCA
ACGGCCAGCA CGACCAGCCG CTGTAGCTGG CGCCCCCCCA CTAGCGGAGC GACCAGCCGT

970 980 990 1000 1010 1020

CGCATGCGAT CCTGCCCGAG GAAGACCGGC GCATGGTGAC CGGCTATCTC ATCAACAAGT
GCGTACGCTA GGACGGGCTC CTTCTGGCCG CGTACCACTG GCCGATAGAG TAGTTGTTCA

1030 1040 1050 1060 1070 1080

TCCGCGGCGA CGTCACGCTG TTCGACGACG GCATTGCTGC CGTCAACCGC TACACCGGCT
AGGCGCCGCT GCAGTGCGAC AAGCTGCTGC CGTAACGACG GCAGTTGGCG ATGTGGCCGA


1090 1100 1110 1120 1130 1140

GGCCCTGCTT CGGCGTCGTG CCGTGGCTGA AGGCGGCGGC ACGCCTGCCG GCGGAAGATT
CCGGGACGAA GCCGCAGCAC GGCACCGACT TCCGCCGCCG TGCGGACGGC CGCCTTCTAA

1150 1160 1170 1180 1190 1200

CCGTCGTGCT GGAGAAGCTG ACGCGCGGCG AGGGGCGGGC GCTGAAGGTT GCCGTCCCGG
GGCAGCACGA CCTCTTCGAC TGCGCGCCGC TCCCCGCCCG CGACTTCCAA CGGCAGGGCC
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1210 1220 1230 

TACTGTCGCG CATCGCCAAT TTCGACGACC 
ATGACAGCGC GTAGCGGTTA AAGCTGCTGG 

1270 1280 1290 

ATCTCGTCTT CGTGCGGCCT GGCAGTCCCA 
TAGAGCAGAA GCACGCCGGA CCGTCAGGGT 

1330 1340 1350 

CCGGGTCGAA ATCGACCATC GGCGACCTCA 
GGCCCAGCTT TAGCTGGTAG CCGCTGGAGT 

1390 1400 1410 

ACCTCGAACG TCATGTGCGC CGGGGCGGCC 
TGGAGCTTGC AGTACACGCG GCCCCGCCGG 

1450 1460 1470 

TGCTCGGCCG GCGCGTCACC GATCCGCTCG 
ACGAGCCGGC CGCGCAGTGG CTAGGCGAGC 

1510 1520 1530 

GCCTCGGGCT GCTCGAGGTC GAGACCGAGA 
CGGAGCCCGA CGAGCTCCAG CTCTGGCTCT 

1570 1580 1590 

GCGCCTGGTC GCTGGAGCAT GATGTGGTGC 
CGCGGACCAG CGACCTCGTA CTACACCACG 

1630 1640 1650 

CGCAAGGTGC GGACTGTGGC CGGCCGTCGG 
GCGTTCCACG CCTGACACCG GCCGGCAGCC 

1690 1700 1710 

TTTCGGCCGA TGGCCGCGTG ATGGGCACCT 
AAAGCCGGCT ACCGGCGCAC TACCCGTGGA 

1750 1760 1770 

ATCGCGGCGC GCTGCTCAAG AGTTTCGGCA 
TAGCGCCGCG CGACGAGTTC TCAAAGCCGT 





1240 1250 1260

TCGATCCGCT CGCCGCCGAA CCGGAGATTG
AGCTAGGCGA GCGGCGGCTT GGCCTCTAAC

1300 1310 1320

TTCCGGTCGA CGCTGGCCTC GTCGTCATTC
AAGGCCAGCT GCGACCGGAG CAGCAGTAAG

1360 1370 1380

TCGATTTCCG TGCGCAAGGG TGGGACCGTG
AGCTAAAGGC ACGCGTTCCC ACCCTGGCAC

1420 1430 1440

GGGTCATCGG CATCTGCGGC GGCTACCAGA
CCCAGTAGCC GTAGACGCCG CCGATGGTCT

1480 1490 1500

GCATCGAGGG CGGCGAACGT GCGGTCGAGG
CGTAGCTCCC GCCGCTTGCA CGCCAGCTCC

1540 1550 1560

TGGCGCCGGA AAAGACGGTG CGCAACAGCC
ACCGCGGCCT TTTCTGCCAC GCGTTGTCGG

1600 1610 1620

TCGAAGGCTA CGAAATCCAT CTTGGCAAGA
AGCTTCCGAT GCTTTAGGTA GAACCGTTCT


1660 1670 1680






bAbbbbbbbb 


ALbLbiAbLi 


P fP T 71 r" rTTT" P 

biiAGLGtbG 


CTGCCGCGGG 


1720 1730 1740

ACCTGCATGG GCTCTTCACC AGCGACGCCT
TGGACGTACC CGAGAAGTGG TCGCTGCGGA

1780 1790 1800

TCGAAGGCGG CGCCAACAAC TACCGCCAAT
AGCTTCCGCC GCGGTTGTTG ATGGCGGTTA
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1810 1820 1830 1840 1850 1860

CGGTCGATGC GGCGCTCGAC GATGTCGCGA ACGAACTGGA GGCTGTGCTC GATCGTCGCT
GCCAGCTACG CCGCGAGCTG CTACAGCGCT TGCTTGACCT CCGACACGAG CTAGCAGCGA

1870 1880 1890 1900 1910 1920

GGCTGGACGA GTTGCTCAGG CACTAGGGAC GCGGCAACGG TCAGCCAGCA GGTCCGGTAC
CCGACCTGCT CAACGAGTCC GTGATCCCTG CGCCGTTGCC AGTCGGTCGT CCAGGCCATG

1930 1940 1950 1960 1970 1980

GTCGGGCCCA ACAGGAGCAA CGAGCTTATC CGACGGAACT ACGCTGCGAC ATCGTGCTCC
CAGCCCGGGT TGTCCTCGTT GCTCGAATAG GCTGCCTTGA TGCGACGCTG TAGCACGAGG

1990 2000 2010 2020 2030 2040

TCGCTTGCGG CTTCCCAGAC TTCCCGCGCG GCATCCAGGT TCATCAGGGC AATCCCCAGG
AGCGAACGCC GAAGGGTCTG AAGGGCGCGC CGTAGGTCCA AGTAGTCCCG TTAGGGGTCC

2050 2060 2070 2080 2090 2100

CCGACGATCA GGTCCGGCCA GGCCGACTGC CACAGATAGG CTGTCGCCAG ACCCGCGGCG
GGCTGCTAGT CCAGGCCGGT CCGGCTGACG GTGTCTATCC GACAGCGGTC TGGGCGCCGC

2110 2120 2130 2140 2150 2160

ATGATGGCCA CATTGGCGAA GGCATCGTTG CGGGCCGAGA GAAATGCTGC CCGCGTGAGC
TACTACCGGT GTAACCGCTT CCGTAGCAAC GCCCGGCTCT CTTTACGACG GGCGCACTCG


2170 2180 2190 2200 2210 2220

GTGCCGCTCG TGTGACGGTA GGCGACGAGC


TV r'Tv rriTv rr^r^rr^ 

AGATAbbCGC 


TV TV r^APPrnrp 
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2350 


2360 


2370 
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CATGTTGACG 
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2410 2420 2430 2440 2450 2460 

GAGGCGTCTT CGAGGAAGTC GACGCTGTCG GCCATGAGGG ACACCGAGCC GATCGAAAGC 

CTCCGCAGAA GCTCCTTCAG CTGCGACAGC CGGTACTCCC TGTGGCTCGG CTAGCTTTCG 

2470 2480 2490 2500 2510 2520 

GCGACAAGGA GTTCGACCCC GAAATAGCCA AGGTTCAACA GGGAGACGAT GAGGACGACG 

CGCTGTTCCT CAAGCTGGGG CTTTATCGGT TCCAAGTTGT CCCTCTGCTA CTCCTGCTGC 

2530 2540 2550 2560 2570 2580 

CGGCGCAGGT CGGTATCCAC TCGAAAGGTT CCCTTTCTGG CGAGATTCGC CCTCGGCACT 

GCCGCGTCCA GCCATAGGTG AGCTTTCCAA GGGAAAGACC GCTCTAAGCG GGAGCCGTGA 



2590 2600 
TTTTTTGGCG AGATTCGCCC 
AAAAAACCGC TCTAAGCGGG 



2610 2620 
TCGGCACTTT GGCACAGGTG 
AGCCGTGAAA CCGTGTCCAC 



2630 2640 
TTAGCAGCAG TTTGCTATCC 
AATCGTCGTC AAACGATAGG 



2650 2660 2670 2680 2690 2700 

ATAGCACTAG GTTTCGACAT CGGTTCCGTT CACACTGCCG TCGTGCCTGA CGCCCGACAA 
TATCGTGATC CAAAGCTGTA GCCAAGGCAA GTGTGACGGC AGCACGGACT GCGGGCTGTT 

2710 2720 2730 2740 2750 2760 

ATCGTCGCGT GGCGCAACTC GGCCGGGGAG GCGTCGCATG CGTCGATTGA CTTTGGGCTG 
TAGCAGCGCA CCGCGTTGAG CCGGCCCCTC CGCAGCGTAC GCAGCTAACT GAAACCCGAC 

2770 2780 2790 2800 2810 2820 

CCCGCTTCCT AATCATCAGG TGTTGGATGG TTCCCCCTTG TCGTGGCGAT CTGGGGGAAT 
GGGCGAAGGA TTAGTAGTCC ACAACCTACC AAGGGGGAAC AGCACCGCTA GACCCCCTTA 

2830 2840 2850 2860 2870 2880 

AATTGGGAAT GTGACGGATG GACCCAAATC GGGCATCCTT ATCGCAGCCG ACCCCGCGAC 
TTAACCCTTA CACTGCCTAC CTGGGTTTAG GCCGTAGGAA TAGCGTCGGC TGGGGCGCTG 

2890 2900 2910 2920 2930 2940 

TGTAGAACGG TCAGGGTTCG CCATCGGGAT TGGTGCCGGG CTGTCGGCCG GTTGCATGGG 
ACATCTTGCC AGTCCCAAGC GGTAGCCCTA ACCACGGCCC GACAGCCGGC CAACGTACCC 

2950 2960 2970 2980 2990 3000 

CAATCGGGGC AGGTCGGGGA TCAAGCCGGA AAAGCCACTG GCGTGGCATC GTGATCAGCC 
GTTAGCCCCG TCCAGCCCCT AGTTCGGCCT TTTCGGTGAC CGCACCGTAG CACTAGTCGG 
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3010 3020 3030 3040 3050 3060 

GGGTTTGGAC GCCTCTTCTT CTACGAATCG TCCGCCTTTC ACGATGTCCC TCACAGCGCC 
CCCAAACCTG CGGAGAAGAA GATGCTTAGC AGGCGGAAAG TGCTACAGGG AGTGTCGCGG 

3070 3080 3090 3100 3110 3120 

CATGCGTCGG AGACGACGCG CAAAGGTTCG CTGTGGCACC GGAAAGACGC CGGGAAGGTG 
GTACGCAGCC TCTGCTGCGC GTTTCCAAGC GACACCGTGG CCTTTCTGCG GCCCTTCCAC 

3130 3140 3150 3160 3170 3180 

AGGCGGGCCG CTCGGGCCCT GACATCGGAA CCTTGCCGTT TAAGGGCGAG GCGATGTTCG 
TCCGCCCGGC GAGCCCGGGA CTGTAGCCTT GGAACGGCAA ATTCCCGCTC CGCTACAAGC 

3190 3200 3210 3220 3230 3240 

GCCCGTGACG CCGTGAGCCA GGAGACCTGC CATCCGGCAT GGGCATTCCG CCCGAGGGGA 
CGGGCACTGC GGCACTCGGT CCTCTGGACG GTAGGCCGTA CCCGTAAGGC GGGCTCCCCT 

3250 3260 3270 3280 3290 3300 

CTTTTGTCTC CAACGCCATC ACGGAGGTTG TTTTGGCTCG CAGATGTTTT CAAGAACGCG 
GAAAACAGAG GTTGCGGTAG TGCCTCCAAC AAAACCGAGC GTGTACAAAA GTTCTTGCGC 

3310 3320 3330 3340 3350 3360 

CCCGTGGCGC GTCCGATGGC TTTTGCCACC GACGGCTGAT TTGGGAATGT TGAGGCAGCC 
GGGCACCGCG CAGGCTACCG AAAACGGTGG CTGCCGACTA AACCCTTACA ACTCCGTCGG 

3370 3380 3390 3400 3410 3420 

ACGATGAGCA GTCTCAGCGC CGGGCCCGTG CTGGTCCTTG GCGGCGCCCG TTCCGGCAAG 
TGCTACTCGT CAGAGTCGCG GCCCGGGCAC GACCAGGAAC CGCCGCGGGC AAGGCCGTTC 

3430 3440 3450 3460 3470 3480 

TCCAGCTTTT CCGAGAGGCT CGTCGAAGCG TCCGGCTTCA CCATGCATTA TGTCGCCACG 
AGGTCGAAAA GGCTCTCCGA GCAGCTTCGC AGGCCGAAGT GGTACGTAAT ACAGCGGTGC 

3490 3500 3510 3520 3530 3540 

GGCCGCGCCT GGGACGACGA AATGCGCGAG CGCATCGACC ATCACCGGAC GCGCCGCGGC 
CCGGCGCGGA CCCTGCTGCT TTACGCGCTC GCGTAGCTGG TAGTGGCCTG CGCGGCGCCG 

3550 3560 3570 3580 3590 3600 

GAGGGCTGGA CGACGCATGA GGAGCCGCTC GATCTCGTCG GCATCCTCAG ACGCATCGAT 
CTCCCGACCT GCTGCGTACT CCTCGGCGAG CTAGAGCAGC CGTAGGAGTC TGCGTAGCTA 
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4210 4220 4230 

ATCATCGAAA CCTCGGGCCT TGCCCTGCCG 
TAGTAGCTTT GGAGCCCGGA ACGGGACGGC 

4270 4280 4290 

GATATCCGCA GCGAAGTGAC CGTCGATGGC 
CTATAGGCGT CGCTTCACTG GCAGCTACCG 

4330 4340 4350 

GCCGCTGGCC GCTTTGCCGA CGACCACGAC 
CGGCGACCGG CGAAACGGCT GCTGGTGCTG 

4390 4400 4410 

AATCTCGATC ACGAAAGCCC GATCGAGGAG 
TTAGAGCTAG TGCTTTCGGG CTAGCTCCTC 

4450 4460 4470 

CTCATCGTTC TCAACAAGAC CGATCTGATC 
GAGTAGCAAG AGTTGTTCTG GCTAGACTAG 

4510 4520 4530 

GAGGTGTCTT CGCGCACCAG CCGCAAGCCC 
CTCCACAGAA GCGCGTGGTC GGCGTTCGGG 

4570 4580 4590

GCCGCTGCCA TCCTGCTTGG CCTCGGTGTC 
CGGCGACGGT AGGACGAACC GGAGCCACAG 

4630 4640 4650 

TCGCATCACG AGATGGAGCA CGAGGCAGGT 
AGCGTAGTGC TCTACCTCGT GCTCCGTCCA

4690 4700 4710 

TTCGTCGTCG AGCTCGGTTC GATCGCCGAT 
AAGCAGCAGC TCGAGCCAAG CTAGCGGCTA 

4750 4760 4770 

GTAATCGCGG AGCACGACGT TCTGCGCCTC 
CATTAGCGCC TCGTGCTGCA AGACGCGGAG 

FIG. 
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4810 4820 4830 4840 4850 4860 

ATGCGCCTCC TGATCCAGGC GGTCGGCGCC CGCATCGACC AATATTACGA CCGCGCCTGG 
TACGCGGAGG ACTAGGTCCG CCAGCCGCGG GCGTAGCTGG TTATAATGCT GGCGCGGACC 

4870 4880 4890 4900 4910 4920 

GGCGCTGGCG AAAAGCGCGG TACGCGCCTC GTCGTCATCG GCCTGCACGA CATGGACGAG 
CCGCGACCGC TTTTCGCGCC ATGCGCGGAG CAGCAGTAGC CGGACGTGCT GTACCTGCTC 

4930 4940 4950 4960 4970 4980 

GCGGCGGTGC GCGCCGCGAT CACCGCGCTC GTGTAGATCG TTCTTTGAAT GAAATGATCT 
CGCCGCCACG CGCGGCGCTA GTGGCGCGAG CACATCTAGC AAGAAACTTA CTTTACTAGA 

4990 5000 5010 5020 5030 5040 

AACGCATTGA AATGATGCAG TTCCGGATGG AGAACGCTTT TAGCGTTTTC GTTCGGAATT 
TTGCGTAACT TTACTACGTC AAGGCCTACC TCTTGCGAAA ATCGCAAAAG CAAGCCTTAA 

5050 5060 5070 5080 5090 5100 

GCCCCAACGG ACAAGACGAA TGCATCTGCT TCTCGCCCAG AAAGGAACGA TCGCCGACGG 
CGGGGTTGCC TGTTCTGCTT ACGTAGACGA AGAGCGGGTC TTTCCTTGCT AGCGGCTGCC 

5110 5120 5130 5140 5150 5160 

CAACGAGGCG ATCGACCTTG GGCAAACGCC GGCCGATATC CTTTTCCTAT CGGCCGCCGA 
GTTGCTCCGC TAGCTGGAAC CCGTTTGCGG CCGGCTATAG GAAAAGGATA GCCGGCGGCT 

5170 5180 5190 5200 5210 5220 

CACCGAGCTC TCCTCGATCG CCGCGGCTCA CGGCCGACGC GACGGAGGCT TGAGCCTGCG 
GTGGCTCGAG AGGAGCTAGC GGCGCCGAGT GCCGGCTGCG CTGCCTCCGA ACTCGGACGC 

5230 5240 5250 5260 5270 5280 

CATCGCCAGC CTGATGAGCC TGATGCACCC GATGTCGGTC GACACTTACG TCGAGCGCAC 
GTAGCGGTCG GACTACTCGG ACTACGTGGG CTACAGCCAG CTGTGAATGC AGCTCGCGTG 

5290 5300 5310 5320 5330 5340 

GGCGCGTCAC GCCAAGCTGA TCGTCGTCCG GCCGCTCGGT GGCGCCAGCT ATTTCCGTTA 
CCGCGCAGTG CGGTTCGACT AGCAGCAGGC CGGCGAGCCA CCGCGGTCGA TAAAGGCAAT 

5350 5360 5370 5380 5390 5400 

TCTGCTGGAA GCCCTGCATG CGGCTGCCGT CACCCATCGT TTCGAGATCG CGGTTCTGCC 
AGACGACCTT CGGGACGTAC GCCGACGGCA GTGGGTAGCA AAGCTCTAGC GCCAAGACGG 
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6010 6020 6030 6040 6050 6060 

GGGGCTGATG GCGCGCGACC TCGCCATGAA CGTGGCACTC CCCGAAGTCG ATGGCCGCAT 
CCCCGACTAC CGCGCGCTGG AGCGGTACTT GCACCGTGAG GGGCTTCAGC TACCGGCGTA 

6070 6080 6090 6100 6110 6120 

CCTTGCGCGC GCCGTCTCCT TCAAGGCGGC GTCGATCTAT GACGCCAAGG TGGAGGCCAA 
GGAACGCGCG CGGCAGAGGA AGTTCCGCCG CAGCTAGATA CTGCGGTTCC ACCTCCGGTT 

6130 6140 6150 6160 6170 6180 

TATCGTCGGC CATGAGCCGC TCGAAGGCCG GGTGCGCTTT GCCGCTGATC TTGCCGTCAA 
ATAGCAGCCG GTACTCGGCG AGCTTCCGGC CCACGCGAAA CGGCGACTAG AACGGCAGTT 

6190 6200 6210 6220 6230 6240 

CTGGGCGAAC GTGCGCCGGG CAGAGCCCGC CGAGCGCCGT ATTGCCATCG TCATGGCCAA 
GACCCGCTTG CACGCGGCCC GTCTCGGGCG GCTCGCGGCA TAACGGTAGC AGTACCGGTT 

6250 6260 6270 6280 6290 6300 

CTATCCGAAC CGCGACGGTC GCCTCGGCAA CGGTGTCGGG CTCGACACGC CGGCCGGTAC 
GATAGGCTTG GCGCTGCCAG CGGAGCCGTT GCCACAGCCC GAGCTGTGCG GCCGGCCATG 

6310 6320 6330 6340 6350 6360 

CGTCGAGGTG CTTAGCGCCA TGGCGCGGGA AGGCTATGCG GTCGGTGAGG TTCCCGCCGA 
GCAGCTCCAC GAATCGCGGT ACCGCGCCCT TCCGATACGC CAGCCACTCC AAGGGCGGCT 

6370 6380 6390 6400 6410 6420 

TGGCGAGGCG CTGATCCGCT TTCTGATGGC CGGGCCGACC AATGCGGCGA GCCATGACCG 
ACCGCTGCGC GACTAGGCGA AAGACTACCG GCCCGGCTGG TTACGCCGCT CGGTACTGGC 

6430 6440 6450 6460 6470 6480 

TGAAATCCGC GAGCGTATTT CGCTGAACGA TTACAAAACG TTCTTCGATT CGCTTCCGAA 
ACTTTAGGCG CTCGCATAAA GCGACTTGCT AATGTTTTGC AAGAAGCTAA GCGAAGGCTT 

6490 6500 6510 6520 6530 6540 

ACAGATAAAG GATGAAGTTG CCGGTCGCTG GGGCGTGCCG GAGGCCGATC CCTTTTTCCT 
TGTCTATTTC CTACTTCAAC GGCCAGCGAC CCCGCACGGC CTCCGGCTAG GGAAAAAGGA 

6550 6560 6570 6580 6590 6600 

CGATGGCGCC TTCGCGCTGC CGCTCGCCCG CTTCGGCGAG GTGATCGTCG GCATCCAACC 
GCTACCGCGG AAGCGCGACG GCGAGCGGGC GAAGCCGCTC CACTAGCAGC CGTAGGTTGG 
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6610 6620 6630 6640 6650 6660 

GGCGCGCGGC TACAACATCG ATCCGAAGGA AAGCTACCAT TCCCCGGACC TCGTGCCGCC 
CCGCGCGCCG ATGTTGTAGC TAGGCTTCCT TTCGATGGTA AGGGGCCTGG AGCACGGCGG 

6670 6680 6690 6700 6710 6720

GCATGGCTAT CTCGCCTTCT ACGCCTTCCT GCGCCAGCAG TTCGGAGCGC AGGCGATCGT 
CGTACCGATA GAGCGGAAGA TGCGGAAGGA CGCGGTCGTC AAGCCTCGCG TCCGCTAGCA 

6730 6740 6750 6760 6770 6780

CCACATGGGC AAGCACGGCA ATCTCGAATG GCTGCCGGGC AAGGCGCTGG CGCTGTCGGA 
GGTGTACCCG TTCGTGCCGT TAGAGCTTAC CGACGGCCCG TTCCGCGACC GCGACAGCCT 

6790 6800 6810 6820 6830 6840 

AACCTGCTAT CCCGAAGCGA TCTTCGGGCC GCTGCCGCAC ATCTATCCCT TCATCGTCAA 
TTGGACGATA GGGCTTCGCT AGAAGCCCGG CGACGGCGTG TAGATAGGGA AGTAGCAGTT 

6850 6860 6870 6880 6890 6900 

CGATCCGGGC GAAGGTACGC AGGCCAAGCG CCGCACCAGC GCCGTCATCA TCGACCACCT 
GCTAGGCCCG CTTCCATGCG TCCGGTTCGC GGCGTGGTCG CGGCAGTAGT AGCTGGTGGA 

6910 6920 6930 6940 6950 6960 

GACCCCGCCC TTGACGCGCG CCGAATCCTA CGGCCCGCTC AAGGATCTGG AAGCGCTCGT 
CTGGGGCGGG AACTGCGCGC GGCTTAGGAT GCCGGGCGAG TTCCTAGACC TTCGCGAGCA 

6970 6980 6990 7000 7010 7020 

CGACGAATAT TACGACGCCG CCGGCGGTGA TCCGCGCCGC CTCAGGCTGC TCAGCCGCCA 
GCTGCTTATA ATGCTGCGGC GGCCGCCACT AGGCGCGGCG GAGTCCGACG AGTCGGCGGT 

7030 7040 7050 7060 7070 7080 

GATCCTCGAT CTCGTGCGCG ACATCGGCCT CGACAGCGAC GCAGGCATCG ACAGGGGCGA 
CTAGGAGCTA GAGCACGCGC TGTAGCCGGA GCTGTCGCTG CGTCCGTAGC TGTCCCCGCT

7090 7100 7110 7120 7130 7140 

CAGCGACGAC AAGGCGCTGG AAAAGCTCGA CGCCTATCTC TGCGACCTCA AGGAAATGCA 
GTCGCTGCTG TTCCGCGACC TTTTCGAGCT GCGGATAGAG ACGCTGGAGT TCCTTTACGT 

7150 7160 7170 7180 7190 7200 

GATCCGCGAC GGCCTGCACA TCTTCGGCGT TGCGCCGGAA GGGCGGTTGT TGACGGACCT 
CTAGGCGCTG CCGGACGTGT AGCCGCCGCA ACGCGGCCTT CCCGCCAACA ACTGCCTGGA 
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7210 7220 7230 7240 7250 7260 

CACCGTAGCG CTGGCGCGCG TGCCCCGAGG TCTCGGCGAG GGCGGCGACC AGAGCCTGCA 
GTGGCATCGC GACCGCGCGC ACGGGGCTCC AGAGCCGCTC CCGCCGCTGG TCTCGGACGT 

7270 7280 7290 7300 7310 7320 

GCGGGCGATC GCAGCGGATG CGGGGCTGCG TGGGTTTGCT ATTCCCACCT CGGCGGGGGG 
CGCCCGCTAG CGTCGCCTAC GCCCCGACGC ACCCAAACGA TAAGGGTGGA GCCGCCCCCC 

7330 7340 7350 7360 7370 7380 

CAACCCCGCA CGCGACGCCC AACCCTTCGA CCCGCTCGAC TGCGTCATGT CCGACACCTG 
GTTGGGGCGT GCGCTGCGGG TTGGGAAGCT GGGCGAGCTG ACGCAGTACA GGCTGTGGAC 

7390 7400 7410 7420 7430 7440 

GACAGGCCCG AAACCGTCCA TCCTCGCTGA CCTCTCGGAC GCCCCCTGGC GCACCGCCGG 
CTGTCCGGGC TTTGGCAGGT AGGAGCGACT GGAGAGCCTG CGGGGGACCG CGTGGCGGCC 

7450 7460 7470 7480 7490 7500 

CGATACGGTC GAGCGCATCG AGTTGCTTGC CGCAAATCTC GTGTCGGGTG AACTGGCTTG 
GCTATGCCAG CTCGCGTAGC TCAACGAACG GCGTTTAGAG CACAGCCCAC TTGACCGAAC 

7510 7520 7530 7540 7550 7560 

CCCGGACCAC TGGGCCAACA CCCGCGCCGT GCTCGGCGAA ATCGAAACGC GCCTGAAGCC 
GGGCCTGGTG ACCCGGTTGT GGGCGCGGCA CGAGCCGCTT TAGCTTTGCG CGGACTTCGG 

7570 7580 7590 7600 7610 7620 

GTCGATTTCA AACTCGGGTG CCGCCGAGAT GACCGGCTTC CTCACCGGTC TCAGCGGCCG 
CAGCTAAAGT TTGAGCCCAC GGCGGCTCTA CTGGCCGAAG GAGTGGCCAG AGTCGCCGGC 



7630 
CTTCGTCGCC 
GAAGCAGCGG 



7640 
CCCGGTCCAT 
GGGCCAGGTA 



7650 
CGGGCGCGCC 
GCCCGCGCGG 



7660 
GACGCGCGGC 
CTGCGCGCCG 



7670 
CGGCCGGATG 
GCCGGCCTAC 



7680 
TGTTGCCGAC 
ACAACGGCTG 



7690 7700 7710 7720 7730 7740 

GGGGCGCAAT TTCTACTCGG TCGACAGCCG CGCCGTGCCG ACGCCGGCGG CTTACGAGCT 
CCCCGCGTTA AAGATGAGCC AGCTGTCGGC GCGGCACGGC TGCGGCCGCC GAATGCTCGA 

7750 7760 7770 7780 7790 7800 

TGGCAAGAAA TCGGCCGAGC TTCTGATCCG CCGCTACCTG CAGGACCATG GCGAATGGCC 
ACCGTTCTTT AGCCGGCTCG AAGACTAGGC GGCGATGGAC GTCCTGGTAC CGCTTACCGG 
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7810 7820 7830 

GTCCTCCTTT GGCCTGACCG CCTGGGGCAC 
CAGGAGGAAA CCGGACTGGC GGACCCCGTG

7870 7880 7890 

CGCCCAGGCC CTGGCGCTGA TCGGCGCCAA 
GCGGGTCCGG GACCGCGACT AGCCGCGGTT 

7930 7940 7950 

GATGGGCTAC GAGATCGTGC CGCTCGCAGT 
CTACCCGATG CTCTAGCACG GCGAGCGTCA 

7990 8000 8010 

GCGCATTTCC GGCTTCTTCC GCGATGCCTT 
CGCGTAAAGG CCGAAGAAGG CGCTACGGAA 

8050 8060 8070 

GATCCGCGCC GTCGCGCTGG AGGAAGACGA 
CTAGGCGCGG CAGCGCGACC TCCTTCTGCT 

8110 8120 8130 

GGCGGAAAGC CGGCGGCTGG AGGCCGAAGG 
CCGCCTTTCG GCCGCCGACC TCCGGCTTCC 

8170 8180 8190 

CTCCTACCGC GTCTTTGGCG CAAAGCCCGG 
GAGGATGGCG CAGAAACCGC GTTTCGGGCC 

8230 8240 8250 

CGACGAGAAG GGCTGGGAAA CCAAAGCAGA 
GCTGCTCTTC CCGACCCTTT GGTTTCGTCT 

8290 8300 8310 

CTATGCCTAT GGCGCCGGCG AGGAGGGCAA 
GATACGGATA CCGCGGCCGC TCCTCCCGTT 

8350 8360 8370 

GCGCACGATA GAGGCGGTGG TGCAGAACCA 
CGCGTGCTAT CTCCGCCACC ACGTCTTGGT 
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8270 
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8410 8420 8430 

CGACGACTAC TACCAGTTCG AAGGCGGCAT 
GCTGCTGATG ATGGTCAAGC TTCCGCCGTA 

8470 8480 8490

CCGTCCGGCG ATCTACCACA ACGACCATTC 
GGCAGGCCGC TAGATGGTGT TGCTGGTAAG 

8530 8540 8550 

CGAAGAAGAG ATCGGCCGCG TGGTCCGGGC 
GCTTCTTCTC TAGCCGGCAC ACCAGGCCCG 

8590 8600 8610 

CGTCATGCGC CACGGATACA AGGGCGCCTT 
GCAGTACGCG GTGCCTATGT TCCCGCGGAA 

8650 8660 8670 

CGCCTTTGCC GCGACCACGG GTGCGGTGCG 
GCGGAAACGG CGCTGGTGCC CACGCCACGC 

8710 8720 8730 

GTTCATTGTC GACGAGCGCG TGGCTGACTT 
CAAGTAACAG CTGCTCGCGC ACCGACTGAA 

8770 8780 8790 

CGAGCTTGCC GAACGCCTGC TTGAAGCAAT 
GCTCGAACGG CTTGCGGACG AACTTCGTTA 

8830 8840 8850 

TTCGGCGCGG TTTGAACTTG CCGGCATCGG 
AAGCCGCGCC AAACTTGAAC GGCCGTAGCC 

8890 8900 8910 

TGAATAGAGC GGTTCCGGGC TGGCGGTTAT 
ACTTATCTCG CCAAGGCCCG ACCGCCAATA 

8950 8960 8970 

GGTTCCGTTT CGCTGCTCAG TGAAGTGCGA
CCAAGGCAAA GCGACGAGTC ACTTCACGCT 
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9010 9020 9030 9040 9050 9060 

CCCATCCCGA ACTTGAGAAC TGAGGGAGTG ATCATGAGCG ACGAGACGAC AGTAGGCGGC
GGGTAGGGCT TGAACTCTTG ACTCCCTCAC TAGTACTCGC TGCTCTGCTG TCATCCGCCG 

9070 9080 9090 9100 9110 9120 

GAAGCCCCGG CCGAGAAGGA CGATGCCCGC CACGCCATGA AGATGGCGAA GAAGAAGGCA 
CTTCGGGGCC GGCTCTTCCT GCTACGGGCG GTGCGGTACT TCTACCGCTT CTTCTTCCGT 

9130 9140 9150 9160 9170 9180 

GCACGCGAAA AGATCATGGC GACGAAGACC GACGAGAAGG GTCTGATCAT CGTCAACACC 
CGTGCGCTTT TCTAGTACCG CTGCTTCTGG CTGCTCTTCC CAGACTAGTA GCAGTTGTGG 

9190 9200 9210 9220 9230 9240 

GGCAAAGGCA AGGGCAAGTC GACCGCCGGC TTCGGCATGA TCTTCCGCCA TATCGCCCAC 
CCGTTTCCGT TCCCGTTCAG CTGGCGGCCG AAGCCGTACT AGAAGGCGGT ATAGCGGGTG 

9250 9260 9270 9280 9290 9300 

GGCATGCCCT GCGCCGTCGT GCAGTTCATC AAGGGTGCGA TGGCAACCGG CGAGCGCGAG 
CCGTACGGGA CGCGGCAGCA CGTCAAGTAG TTCCCACGCT ACCGTTGGCC GCTCGCGCTC 

9310 9320 9330 9340 9350 9360 

TTGATCGAGA AGCATTTCGG CGATGTCTGC CAGTTCTACA CGCTCGGCGA GGGCTTCACC 
AACTAGCTCT TCGTAAAGCC GCTACAGACG GTCAAGATGT GCGAGCCGCT CCCGAAGTGG 

9370 9380 9390 9400 9410 9420 

TGGGAAACGC AGGATCGCGC CCGCGATGTT GCGATGGCTG AAAAGGCCTG GGAGAAGGCG 
ACCCTTTGCG TCCTAGCGCG GGCGCTACAA CGCTACCGAC TTTTCCGGAC CCTCTTCCGC 

9430 9440 9450 9460 9470 9480 

AAGGAACTGA TCCGTGACGA GCGCAACTCG ATGGTGCTGC TCGACGAGAT CAACATTGCT 
TTCCTTGACT AGGCACTGCT CGCGTTGAGC TACCACGACG AGCTGCTCTA GTTGTAACGA 

9490 9500 9510 9520 9530 9540 

CTGCGCTACG ACTACATCGA CGTCGCCGAA GTGGTGCGCT TCCTGAAGGA AGAAAAGCCG 
GACGCGATGC TGATGTAGCT GCAGCGGCTT CACCACGCGA AGGACTTCCT TCTTTTCGGC 

9550 9560 9570 9580 9590 9600 

CACATGACGC ATGTGGTGCT CACCGGCCGC AACGCGAAAG AAGACCTGAT CGAAGTCGCC 
GTGTACTGCG TACACCACGA GTGGCCGGCG TTGCGCTTTC TTCTGGACTA GCTTCAGCGG 
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9610 9620 9630 9640 9650 9660

GATCTCGTCA CTGAGATGGA GCTGATCAAG CATCCGTTCC GTTCCGGCAT CAAGGfrPAr 
CTAGAGCAGT GACTCTACCT CGACTAGTTC GTAGGCAAGG ScGtI otSc 

9670 9680 9690 9700 9710 9720

CAGGGCGTGG AGTTCTGATG AGCCAGAGCT GGCAGTTCTG GGCGCTGCTT TCGGCCrrcT 
GTCCCGCACC TCAAGACTAC TCGGTCTCGA CCGTCAAGAC CCGCGACgII LcCgS 

9730 9740 9750 9760 9770 9780

TCGCTGCGCT CACGGCGGTG TTTGCCAAGG TCGGGGTTGC GCAGATCAAC TCCGACT-rrr 
AGCGACGCGA GTGCCGCCAC AAACGGTTCC AGCCCCAACG CGTCTAGTTG IgGcSgC 

9790 9800 9810 9820 9830 9840

CAACGCTGAT CCGCACCGTC GTCATCCTCT GCGTGATCGC CGCCATCGTG GCGGCfAfAr 
GTTGCGACTA GGCGTGGCAG CAGTAGGAGA CGCACTAGCG GCotS CotSc 

9850 9860 9870 9880 9890 9900

GGCAGTGGCA GAAGCCATCG GAAATCCCGG GCCGCACCTG GCTGTTCCTG GCGCTGTCAG
CCGTCACCGT CTTCGGTAGC CTTTAGGGCC CGGCGTGGAC CGACAAGGAC CGCGACAGTC

9910 9920 9930 9940 9950 9960

GGCTTGCGAC TGGCGCTTCC TGGCTTGCCT ATTTCCGCGC GCTGAAGCTC GGCGACGCCG
CCGAACGCTG ACCGCGAAGG ACCGAACGGA TAAAGGCGCG CGACTTCGAG CCGCTGCGGC

9970 9980 9990 10000 10010 10020 

CCCGCGTGGC GCCGCTCGAC AAGCTCTCGA TCGTCATGGT CGCGATCTTC GGCGTrrTPT 

gggcgcaccg cggcgagctg ttcgagagct agcagtacca gcgSg ccracGlGl 

10030 10040 10050 10060 10070 10080
TCCTCGGTGA AAAGCTCAAC CTGATGAACT GGCTCGGCGT CGCCTTCATT GCCGCCGGfr 
AGGAGCCACT TTTCGAGTTG GACTACTTGA ■ CCGAGCCGCA GCGGMGTM SgGCCCC 

10090 10100 10110 10120 10130 10140 

CGCTGTTGCT GGCGGTGTTT TGAGCGCGCC TGCTCTGGTG CCTGTTCACT GAATGCTCGC
GCGACAACGA CCGCCACAAA ACTCGCGCGG ACGAGACCAC GGACAAGTGA CTTACGAGCG

10150 10160 10170 10180 10190 10200 
CTCAATCAAT CCGTAATCCC GACACATGCA GTGGTTGTGA CGAGCGGGAG GACGGCATGC 

GAGTTAGTTA GGCATTAGGG CTGTGTACGT CACCAACACT GCTCGCCCTC CTGCCGTACG
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10210 10220 10230 10240 10250 10260 
AGATTGAAGG CAATTGGAGC GAGCGCCTTC CTGATCCGTC GGGCCACGTC GCGCAGTTCG 
TCTAACTTCC GTTAACCTCG CTCGCGGAAG GACTAGGCAG CCCGGTGCAG CGCGTCAAGC 

10270 10280 10290 10300 10310 10320 
GCAGACGCTG GAAGCGTCGC AGCCTGAGGG TGAGCCCTGC TTCAGACCCA CCGGCGGACA 
CGTCTGCGAC CTTCGCAGCG TCGGACTCCC ACTCGGGACG AAGTCTGGGT GGCCGCCTGT 

10330 10340 10350 10360 10370 10380 
CGCCTGCAAT AGGCACCGTA GGCGTCGCCG AAGACCTTGG CGAGGTGGGT TTCCTCCATG 
GCGGACGTTA TCCGTGGCAT CCGCAGCGGC TTCTGGAACC GCTCCACCCA AAGGAGGTAC 

10390 10400 10410 10420 10430 10440 
CGGATCTGGT AGGAAATCGA GATCCAGGCG GAGAGCGCCA GCGCCACCGA GATGACGTTG 
GCCTAGACCA TCCTTTAGCT CTAGGTCCGC CTCTCGCGGT CGCGGTGGCT CTACTGCAAC 

10450 10460 10470 10480 10490 10500 
GGCACCGCCA TCACCGTGCC GATCAGCGCG GTCACCATGC CGACATAGAT CGGGTTGCGC 
CCGTGGCGGT AGTGGCACGG CTAGTCGCGC CAGTGGTACG GCTGTATCTA GCCCAACGCG 

10510 10520 10530 10540 10550 10560 
GAGAAGGCAT AGAGGCCTGA GGTCACAAGC GGCGCGTCCT GCTTTTCAGG GATGCCGATC 
CTCTTCCGTA TCTCCGGACT CCAGTGTTCG CCGCGCAGGA CGAAAAGTCC CTACGGCTAG 

10570 10580 10590 10600 10610 10620 
TTCCAGGAAT GACGCATCGC CCATTGCGAC AGCATCGTCA GCCCGCCGCC GAGCGTCATC 
AAGGTCCTTA CTGCGTAGCG GGTAACGCTG TCGTAGCAGT CGGGCGGCGG CTCGCAGTAG 

10630 10640 10650 10660 10670 10680 
AGCGCCAGGC CGACGGCGTG AAGGATGGGC GTGTCGAGCG CCGGGATCCG GCCGAGGGCA 
TCGCGGTCCG GCTGCCGCAC TTCCTACCCG CACAGCTCGC GGCCCTAGGC CGGCTCCCGT 

10690 10700 10710 10720 10730 10740 
GCATCGACGG AGGCCGGGAG CATGGCGACC GCCAGCAGGT GGATCACCAG CGCTGCGACG 
CGTAGCTGCC TCCGGCCCTC GTACCGCTGG CGGTCGTCCA CCTAGTGGTC GCGACGCTGC 

10750 10760 10770 10780 10790 10800 
ATCAGGGGGA AAAGCCTGCC CGCAAACCCT TCCGCATCGT CGCCATAGGT TAGCACGACC 
TAGTCCGCCT TTTCGGACGG GCGTTTGGGA AGGCGTAGCA GCGGTATCCA ATCGTGCTGG 
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10810 10820 10830 10840 10850 10860 
GGCGAGCGGC CGGATTGCAC GCGGCGGAGG ATCGCCAGCG CGAGCGTGGA CAATCCCACG 
CCGCTCGCCG GCCTAACGTG CGCCGCCTCC TAGCGGTCGC GCTCGCACCT GTTAGGGTGC 

10870 10880 10890 10900 10910 10920 

ACGAGCATCA GGATGGTGGG AAGGGTGGTG GACATGGAAA CCTCTGGAGC GAGCTGACAA 
TGCTCGTAGT CCTACCACCC TTCCCACCAC CTGTACCTTT GGAGACCTCG CTCGACTGTT 

10930 10940 10950 10960 10970 10980 
GACAGGAGCG CACGACGGGT AGGCGGCCCA TATGAGCGTC TACCCGGCGA AGCATTCTGA 
CTGTCCTCGC GTGCTGCCCA TCCGCCGGGT ATACTCGCAG ATGGGCCGCT TCGTAAGACT 



10990 11000 
TCACCTTGCA ATCTCTAGTA 
AGTGGAACGT TAGAGATCAT 



11010 11020 
ACTAGAGGTT CAAGCGTCGG 
TGATCTCCAA GTTCGCAGCC 



11030 11040 
ACCTGTCCGA CTTTCGTCGT 
TGGACAGGCT GAAAGCAGCA 



11050 11060 11070 11080 11090 11100 
GGTTACCGGA TCTTATTGCC AAGCGTTGGA GGCTGTCATC GTCGCCCCCG CCGTGTCGGA 
CCAATGGCCT AGAATAACGG TTCGCAACCT CCGACAGTAG CAGCGGGGGC GGCACAGCCT 

11110 11120 11130 11140 11150 11160 
AGGTCGGCAA AATTCGTCTC TTGACGGCTG CTCCTTCCGT CGAGCGATTG CATAGGCAGG 
TCCAGCCGTT TTAAGCAGAG AACTGCCGAC GAGGAAGGCA GCTCGCTAAC GTATCCGTCC 

11170 11180 11190 11200 11210 11220 
AGGCCGCACC CATGTTAGAC CGTCGACAGG CTAAATACGG GTGAACCTTG AAGAATACTC 
TCCGGCGTGG GTACAATCTG GCAGCTGTCC GATTTATGCC CACTTGGAAC TTCTTATGAG 

11230 11240 11250 11260 11270 11280 
TCAGAGCTGC GGTTGGTGTC GCATCGGTCT TGCTGTTCTT GTCATCAGGT GTGGCGGGGC 
AGTCTCGACG CCAACCACAG CGTAGCCAGA ACGACAAGAA CAGTAGTCCA CACCGCCCCG 

11290 11300 11310 11320 11330 11340 
AGGCGCAAAC CGTGAAGAGC GGGGCGTCAC GAGCTCAAGA AACGACGACC ACCCAGAAGG 
TCCGCGTTTG GCACTTCTCG CCCCGCAGTG CTCGAGTTCT TTGCTGCTGG TGGGTCTTCC 

11350 11360 11370 11380 11390 11400 
CGAAACCGAA AACTAAAACG ACGCGCAAGC AAAGGGCTGC GGATGAAGCC AAGGCCAAGG 
GCTTTGGCTT TTGATTTTGC TGCGCGTTCG TTTCCCGACG CCTACTTCGG TTCCGGTTCC 
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11410 11420 11430 11440 11450 11460 
CGCTCGCCGA AGCGCGCCGT CCACGGATTT GCAAGACGCG GGAGAGCGAA TGCAGCTATG 
GCGAGCGGCT TCGCGCGGCA GGTGCCTAAA CGTTCTGCGC CCTCTCGCTT ACGTCGATAC 

11470 11480 11490 11500 11510 11520 
GCGCAGGTCC GGTCGGAGAG CAGTGCTCGT GCTGGTCGAA ATCCGGTGCG CCTGATCTTG 
CGCGTCCAGG CCAGCCTCTC GTCACGAGCA CGACCAGCTT TAGGCCACGC GGACTAGAAC 

11530 11540 11550 11560 11570 11580 
GCATAACTGT CAGGCGTTGA CCGCCCGCGA CCTTCGCGCG GGCAGGCAAG CGTGCGTCGC 
CGTATTGACA GTCCGCAACT GGCGGGCGCT GGAAGCGCGC CCGTCCGTTC GCACGCAGCG 

11590 11600 11610 11620 11630 11640 
TCGAAGCGAC GCCTGACGCG ATAGAAATCA CGGGTCGCCT GGTTCGTTCT GAAAGCTTGG 
AGCTTCGCTG CGGACTGCGC TATCTTTAGT GCCCAGCGGA CCAAGCAAGA CTTTCGAACC 

11650 11660 11670 11680 11690 11700 
GATTGGGTTT AGGTGATGGA AGCCGGCGTT GAACGCAAAA TAATGATCGA TCTCGAGAAC 
CTAACCCAAA TCCACTACCT TCGGCCGCAA CTTGCGTTTT ATTACTAGCT AGAGCTCTTG 

11710 11720 11730 11740 11750 11760 
AGCGCGCTCC AGTTTGCAAC CCGAGCACAC GGCGAACAGA AGCGTAAGTA TGACGGTCGG 
TCGCGCGAGG TCAAACGTTG GGCTCGTGTG CCGCTTGTCT TCGCATTCAT ACTGCCAGCC 

11770 11780 11790 11800 11810 11820 
CCCTATATCG TTCATCCGAT TGCGGTGGCG GAGATTGTTC GAAGCGTGCC CCATACGCCC 
GGGATATAGC AAGTAGGCTA ACGCCACCGC CTCTAACAAG CTTCGCACGG GGTATGCGGG 

11830 11840 11850 11860 11870 11880 
GAAATGATCG CCGCAGCGCT GCTTCACGAT ACGGTCGAAG ATACCGACGC GACGCTGCTG 
CTTTACTAGC GGCGTCGCGA CGAAGTGCTA TGCCAGCTTC TATGGCTGCG CTGCGACGAC 

11890 11900 11910 11920 11930 11940 
GAGATCAAGG AAGCGTTCGG CCCCAAGGTC GCAACACTGG TTGCGTGGCT CACCGACATA 
CTCTAGTTCC TTCGCAAGCC GGGGTTCCAG CGTTGTGACC AACGCACCGA GTGGCTGTAT 

11950 11960 11970 11980 11990 12000 
TCCACTCCGT TCCACGGCAA CCGACAGGTG CGCAAGGAAC TGGATCGCCA GCACCTCGCA 
AGGTGAGGCA AGGTGCCGTT GGCTGTCCAC GCGTTCCTTG ACCTAGCGGT CGTGGAGCGT 
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12010 12020 12030 12040 12050 12060 
TCGGCGCCCG CCGCGGCGAA AACCGTCAAG CTCGCCGACC TGATCGACAA TGCGATAGCG 
AGCCGCGGGC GGCGCCGCTT TTGGCAGTTC GAGCGGCTGG ACTAGCTGTT ACGCTATCGC 

12070 12080 12090 12100 12110 12120 
ATCAAAGCCG GCGATCCGAA TTTCTGGAAA GTGTTCGGCG CCGAGATGAA ACGCTTGCTG 
TAGTTTCGGC CGCTAGGCTT AAAGACCTTT CACAAGCCGC GGCTCTACTT TGCGAACGAC 

12130 12140 12150 12160 12170 12180
GAGGTCTTGG GCGACGGCGA CGAGACCCTT CTCGCAAAGG CCCGTGCATT AGCGCCGGAA 
CTCCAGAACC CGCTGCCGCT GCTCTGGGAA GAGCGTTTCC GGGCACGTAA TCGCGGCCTT

12190 12200 12210 12220 12230 12240 
TGAGAGTGCC GCCGTTTATC GGCAAGCATG TCTGTGCCAT GTCGACCCGG TCAACCGGTC 
ACTCTCACGG CGGCAAATAG CCGTTCGTAC AGACACGGTA CAGCTGGGCC AGTTGGCCAG 

12250 12260 12270 12280 12290 12300 
ATCCAAGATC GCAGAACGGA CATGCATTTG CGGTTTTGCC CGCCGGTGTG GCCCAGCCAC 
TAGGTTCTAG CGTCTTGCCT GTACGTAAAC GCCAAAACGG GCGGCCACAC CGGGTCGGTG 

12310 12320 12330 12340 12350 12360 
GCCTCACAGG CTGCGCGGTT GCGGCCGTTA GGACAGCGCA GAATTTGCCG ACCGCGCCGC 
CGGAGTGTCC GACGCGCCAA CGCCGGCAAT CCTGTCGCGT CTTAAACGGC TGGCGCGGCG 

12370 12380 12390 12400 12410 12420 
GCCTCAATGC CCCAGCCAGA TCCGCAAGGG ATGCGTCGGA TCTGCGAGCA GCCGGATCGC 
CGGAGTTACG GGGTCGGTCT AGGCGTTCCC TACGCAGCCT AGACGCTCGT CGGCCTAGCG 

12430 12440 12450 12460 12470 12480 
GAGCGCGATC GAGACGATGA CGAGCAGCGG CTTGATGATC TTGGCGCCCT TGGCCATGGC 
CTCGCGCTAG CTCTGCTACT GCTCGTCGCC GAACTACTAG AACCGCGGGA ACCGGTACCG 

12490 12500 12510 12520 12530 12540 
ATAGCGCGAG CCGACCTGGG CGCCGAGGAA CTGGCCGAGG CCCATCAACA GGCCGACCTT 
TATCGCGCTC GGCTGGACCC GCGGCTCCTT GACCGGCTCC GGGTAGTTGT CCGGCTGGAA 

12550 12560 12570 12580 12590 12600 
CCAGAGAACG GCGCCGAAGA AGAGGAAGAC GCCGAAGGCG CCGACGTTGG AGCCAAAGTT 
GGTCTCTTGC CGCGGCTTCT TCTCCTTCTG CGGCTTCCGC GGCTGCAACC TCGGTTTCAA 
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12610 12620 12630 12640 12650 12660

GAGGAACTTC GTGTGCGCCG TCGCCTTCAA CACGCCGAAG CCGGCGAGGG TAACGAAGCC
CTCCTTGAAG CACACGCGGC AGCGGAAGTT GTGCGGCTTC GGCCGCTCCC ATTGCTTCGG

12670 12680 12690 12700 12710 12720

GAGCATGAAG AACGAGCCGG TGCCGGGGCC GAAGACGCCG TCATAAAAGC CGATTAGCGG
CTCGTACTTC TTGCTCGGCC ACGGCCCCGG CTTCTGCGGC AGTATTTTCG GCTAATCGCC

12730 12740 12750 12760 12770 12780

CACCAGTGTC AGCGTGAAGA CGAAGGGGGT GACGCGGCTG TGCTGGTCGA CGTCGCCCAT
GTGGTCACAG TCGCACTTCT GCTTCCCCCA CTGCGCCGAC ACGACCAGCT GCAGCGGGTA

12790 12800 12810 12820 12830 12840

GTTCGGCTTC AGGCCGAAAT AAAGCGCAAT GGCGATCAGC AGAAAGGGCA GGATCGCCTT
CAAGCCGAAG TCCGGCTTTA TTTCGCGTTA CCGCTAGTCG TCTTTCCCGT CCTAGCGGAA

12850 12860 12870 12880 12890 12900

CAGCACGTCG CCGGGAACGA TGGTTGCGAG CAGGGCGCCG AGCACGGCGC CGGCGGCCGA
GTCGTGCAGC GGCCCTTGCT ACCAACGCTC GTCCCGCGGC TCGTGCCGCG GCCGCCGGCT

12910 12920 12930 12940 12950 12960

CATCAGCGCC ATCGGCAGCT GCTCTTTCAG GTTCACGTGG CCGCGCCGGG CATAGGACAG
GTAGTCGCGG TAGCCGTCGA CGAGAAAGTC CAAGTGCACC GGCGCGGCCC GTATCCTGTC

12970 12980 12990 13000 13010 13020

CGTGGCCGAG CCGGAGCCGA ACAATCCCTG CAGCTTGTTG GTGCCGAGCG TCTGCAAGGG
GCACCGGCTC GGCCTCGGCT TGTTAGGGAC GTCGAACAAC CACGGCTCGC AGACGTTCCC

13030 13040 13050 13060 13070 13080

CGGGATGCCC GCAATGAGCA TGGCCGGAAT GGTGATCATG CCACCGCCGC CGGCGATCGA
GCCCTACGGG CGTTACTCGT ACCGGCCTTA CCACTAGTAC GGTGGCGGCG GCCGCTAGCT

13090 13100 13110 13120 13130 13140

ATCGATGAAG CCTGCGATGA AGGCGGCGAC GAACAGGAAG GCGAGCAGGT GGAAGGCGAG
TAGCTACTTC GGACGCTACT TCCGCCGCTG CTTGTCCTTC CGCTCGTCCA CCTTCCGCTC

ATCT
TAGA




FIG. 


43 V 
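The lower line of each row in the listing is the base-by-base complement of the upper line, so a row can be checked mechanically. The following is a minimal sketch of such a check, applied to the final row of the listing (positions 13081 to 13144); the helper name `complement` is illustrative and not part of the original document.

```python
# Watson-Crick complement table for the four unambiguous bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement(strand: str) -> str:
    """Return the base-by-base complement, ignoring the spaces between 10-base blocks."""
    return strand.replace(" ", "").translate(COMPLEMENT)

# Final row of the 13144-bp listing: upper strand and the strand printed beneath it.
top = "ATCGATGAAG CCTGCGATGA AGGCGGCGAC GAACAGGAAG GCGAGCAGGT GGAAGGCGAG ATCT"
bottom = "TAGCTACTTC GGACGCTACT TCCGCCGCTG CTTGTCCTTC CGCTCGTCCA CCTTCCGCTC TAGA"

# The printed lower strand should equal the complement of the upper strand.
print(complement(top) == bottom.replace(" ", ""))
```

The lower strand is printed left to right under the upper strand, so no reversal is applied here; a reverse complement would be needed only to read the lower strand in its own 5' to 3' direction.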
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RESTRICTION MAP OF THE 13144-BP SEQUENCE 

ApaLI     642, 
EcoRI     8818, 
HindIII   11633, 
MluI      7963, 
NdeI      10950, 
PvuII     12918, 
SfiI      3133, 
SplI      99, 
BglII     8248, 13139, 
KpnI      2315, 6300, 
NotI      5526, 7615, 
SmaI      1322, 9868, 
SspI      4843, 6968, 
XmnI      9313, 12091, 
AatII     1033, 9503, 12773, 
AflIII    550, 7963, 8634, 
BalI      2107, 6236, 12473, 
BamHI     2266, 5416, 10664, 
BspMII    5002, 8494, 8914, 
EcoRV     4263, 4605, 5137, 
NcoI      6318, 7786, 12474, 
NsiI      3467, 5064, 12266, 
PflMI     7870, 10718, 11065, 
XhoI      1512, 4171, 11692, 
ApaI      1928, 3138, 3386, 8551, 
AsuII     784, 5670, 8418, 11799, 
FspII     784, 5670, 8418, 11799, 
MaeI      1883, 2647, 10995, 11002, 
NruI      1827, 3794, 10002, 12419, 
SauI      852, 7001, 10284, 10517, 
BstEII    995, 3642, 8456, 10470, 11041, 
Eco47III  6954, 7209, 8434, 10731, 11837, 
SacI      5, 4109, 4694, 5169, 11315, 
StuI      204, 4081, 8261, 9406, 10515, 
BstXI     761, 2982, 3612, 6031, 6232, 
SacII     932, 1025, 2096, 3537, 5184, 
SphI      966, 2740, 5360, 8098, 9246, 
          9102, 

FIG. 44A 
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RESTRICTION MAP OF THE 13144-BP SEQUENCE 

BclI      2992, 4016, 9029, 9164, 9623, 10978, 13053, 
RsaI      101, 1201, 1918, 2313, 4881, 6298, 6856, 
Tth111I   1821, 2424, 4351, 7361, 7904, 12227, 12697, 
PstI      613, 3989, 5832, 5952, 7260, 7782, 8211, 12992, 
ClaI      1351, 3596, 4469, 4724, 5748, 6618, 8574, 11687, 
          13082, 
FspI      1363, 1551, 1653, 5219, 7841, 7982, 8342, 9760, 
          11971, 
HinfI     1137, 2564, 2592, 3025, 5667, 5927, 6467, 6923, 
          13079, 
StyI      2488, 3396, 5116, 6105, 6318, 7786, 9745, 10355, 
          11389, 11395, 11903, 12468, 12474, 
DdeI      852, 1875, 3373, 3586, 6311, 7001, 7010, 7610, 
          8956, 9020, 9611, 10284, 10517, 11220, 
Nsp7524I  554, 966, 2394, 2740, 5360, 7840, 8098, 8638, 
          9246, 9553, 10168, 10199, 12210, 12264, 
PvuI      26, 1853, 2453, 4403, 4703, 4728, 5091, 5112, 
          5178, 6717, 7269, 9991, 12429, 13077, 
AvaI      975, 1320, 1503, 1512, 3131, 3231, 3709, 3766, 
          4171, 4212, 7224, 7573, 9866, 11692, 11720, 
BanII     5, 496, 1723, 1928, 2254, 3138, 3386, 4109, 
          4694, 5196, 6207, 6282, 8551, 10296, 11315, 
SalI      83, 1296, 2418, 4045, 4303, 5258, 6959, 7700, 
          7967, 8627, 8708, 9198, 11182, 12221, 12766, 
XhoII     2266, 3920, 5416, 5688, 6943, 7020, 7140, 8248, 
          10382, 10400, 10664, 11048, 12378, 12398, 13139, 
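The coordinates in the restriction map appear to be the number of the last base on the 5' side of each cut. A minimal sketch of that convention follows, using the row of the cobQ sequence that starts at position 1254; the function name `cut_positions` and its parameters are assumptions made for illustration. The recognition sequences used (SalI G^TCGAC, SmaI CCC^GGG) are the standard ones and are palindromic, so scanning one strand is sufficient.

```python
def cut_positions(seq: str, site: str, cut_offset: int, first_base: int = 1) -> list:
    """Return, for each occurrence of `site`, the coordinate of the last base 5' of the cut.

    cut_offset is the number of bases of the recognition site that lie 5' of the cut
    (1 for SalI G^TCGAC, 3 for SmaI CCC^GGG); first_base is the coordinate of seq[0].
    """
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(first_base + start + cut_offset - 1)
        start = seq.find(site, start + 1)  # allow overlapping occurrences
    return positions

# Row of the 13144-bp sequence beginning at position 1254 (end of the FIG. 47A listing).
fragment = "GAGATTGATCTCGTCTTCGTGCGGCCTGGCAGTCCCATTCCGGTCGACGCTGGCCTCGTCGTCATTCCCGGGTCG"

print(cut_positions(fragment, "GTCGAC", 1, first_base=1254))  # SalI
print(cut_positions(fragment, "CCCGGG", 3, first_base=1254))  # SmaI
```

Under this convention the sketch gives 1296 for SalI and 1322 for SmaI, both of which occur in the lists above. Enzymes with degenerate or non-palindromic recognition sequences would need pattern matching and a scan of both strands, which this sketch does not attempt.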
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OPEN READING FRAME 22 
SEQUENCE: 13144-BP FROM 1 TO 2266 LENGTH = 2266 

PHASES: 
PHASE 2: 
PHASE 1: 

ATG (429) TAG (1884) 

226 452 678 904 1130 1356 1582 1808 2034 

OPEN READING FRAME 23 
SEQUENCE: 13144-BP FROM 2266 TO 4000 LENGTH = 1735 

PHASES: 
PHASE 2: 
PHASE 1: 

TGA (3886) 

2438 2611 2784 2957 3130 3303 3476 3649 3822 

FIG. 45A 
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SEQUENCE: 13144-BP FROM 3800 TO 5000 LENGTH = 1201 

FRAMES: 
FRAME 2: 
FRAME 1: 

ATG (3892) TAG (4954) 

3919 4039 4159 4279 4399 4519 4639 4759 4879 

OPEN READING FRAME 24 

FIG. 45B 
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SEQUENCE: 13144-BP FROM 5000 TO 9000 LENGTH = 4001 

FRAMES: 
FRAME 2: 
FRAME 1: 

5399 5799 6199 6599 6999 7399 7799 8199 8599 

CS FRAMES: 
CS FRAME 2: 
CS FRAME 1: 

OPEN READING FRAME 25 

FIG. 45C 
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SEQUENCE: 13144-BP FROM 9000 TO 9700 LENGTH = 701 

FRAMES: 
FRAME 2: 
FRAME 1: 

ATG (9034) TGA (9676) 

9069 9139 9209 9279 9349 9419 9489 9559 9629 

FIG. 45D 
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SEQUENCE: 13144-BP FROM 9600 TO 13144 LENGTH = 3545 

FRAMES: 
FRAME 2: 
FRAME 1: 

9953 10307 10661 11015 11369 11723 12077 12431 12785 

OPEN READING FRAME 26, 27, 28, 29, 30 

A = ATG 
*: STOP CODON 
CS: COMPLEMENTARY STRAND 

FIG. 45E 
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13144-BP SEQUENCE FROM 429 TO 1328 cobQ GENE 

MTRRIMLQGTGSDVGKSVLVAGLCR 

ATGACACGCAGGATCATGTTGCAGGGAACCGGCTCGGATGTCGGAAAATCGGTATTGGTGGCGGGGCTCTGCCGG 

429 439 449 459 469 479 489 499 
LAANQGLKVRPFKPQNMSNNAAVSD 

CTTGCCGCCAATCAGGGCCTGAAGGTCCGGCCGTTCAAGCCGCAGAACATGTCGAACAACGCCGCCGTTTCCGAC 

504 514 524 534 544 554 564 574 

DGGEIGRAQWLQALAARVPSSVHMN 

GACGGCGGCGAGATCGGCCGCGCGCAATGGCTGCAGGCGCTGGCCGCGCGCGTGCCGTCGTCGGTGCACATGAAC 

579 589 599 609 619 629 639 649 
PVLLKPQSDVGSQIVVQGKVAGQAR 

CCGGTGCTCCTGAAGCCGCAGTCGGACGTGGGCAGCCAGATCGTCGTTCAGGGCAAGGTCGCCGGGCAGGCCAGG 

654 664 674 684 694 704 714 724 

GREYQALKPKLLGAVMESFEQISAG 

GGGCGGGAATATCAGGCGCTCAAGCCCAAGCTGCTGGGCGCCGTCATGGAGAGTTTCGAACAAATATCGGCCGGT 

729 739 749 759 769 779 789 799 

ADLVVVEGAGSPAEINLRPGDIANM 

GCCGATCTCGTGGTGGTCGAAGGCGCCGGCTCGCCGGCCGAAATCAACCTCAGGCCCGGCGACATCGCCAATATG 

804 814 824 834 844 854 864 874 
GFATRANVPVVLVGDIDRGGVIASL 

GGCTTTGCGACACGGGCCAATGTGCCGGTCGTGCTGGTCGGCGACATCGACCGCGGGGGGGTGATCGCCTCGCTG 

879 889 899 909 919 929 939 949 
VGTHAILPEEDRRMVTGYLINKFRG 

GTCGGCACGCATGCGATCCTGCCCGAGGAAGACCGGCGCATGGTGACCGGCTATCTCATCAACAAGTTCCGCGGC 

954 964 974 984 994 1004 1014 1024 

DVTLFDDGIAAVNRYTGWPCFGVVP 

GACGTCACGCTGTTCGACGACGGCATTGCTGCCGTCAACCGCTACACCGGCTGGCCCTGCTTCGGCGTCGTGCCG 

1029 1039 1049 1059 1069 1079 1089 1099 
WLKAAARLPAEDSVVLEKLTRGEGR 

TGGCTGAAGGCGGCGGCACGCCTGCCGGCGGAAGATTCCGTCGTGCTGGAGAAGCTGACGCGCGGCGAGGGGCGG 

1104 1114 1124 1134 1144 1154 1164 1174 
ALKVAVPVLSRIANFDDLDPLAAEP 

GCGCTGAAGGTTGCCGTCCCGGTACTGTCGCGCATCGCCAATTTCGACGACCTCGATCCGCTCGCCGCCGAACCG 

1179 1189 1199 1209 1219 1229 1239 1249 
EIDLVFVRPGSPIPVDAGLVVIPGS 

GAGATTGATCTCGTCTTCGTGCGGCCTGGCAGTCCCATTCCGGTCGACGCTGGCCTCGTCGTCATTCCCGGGTCG 

1254 1264 1274 1284 1294 1304 1314 1324 



FIG. 47A 
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13144-BP SEQUENCE FROM 1329 TO 1886 cobQ GENE 



KSTIGDLIDFRAQGWDRDLERHVRR 
AAATCGACCATCGGCGACCTCATCGATTTCCGTGCGCAAGGGTGGGACCGTGACCTCGAACGTCATGTGCGCCGG 
1329 1339 1349 1359 1369 1379 1389 1399 
GGRVIGICGGYQMLGRRVTDPLGIE 
GGCGGCCGGGTCATCGGCATCTGCGGCGGCTACCAGATGCTCGGCCGGCGCGTCACCGATCCGCTCGGCATCGAG 
1404 1414 1424 1434 1444 1454 1464 1474 
GGERAVEGLGLLEVETEMAPEKTVR 
GGCGGCGAACGTGCGGTCGAGGGCCTCGGGCTGCTCGAGGTCGAGACCGAGATGGCGCCGGAAAAGACGGTGCGC 
1479 1489 1499 1509 1519 1529 1539 1549 
NSRAWSLEHDVVLEGYEIHLGKTQG 
AACAGCCGCGCCTGGTCGCTGGAGCATGATGTGGTGCTCGAAGGCTACGAAATCCATCTTGGCAAGACGCAAGGT 
1554 1564 1574 1584 1594 1604 1614 1624 
ADCGRPSVRIDNRADGALSADGRVM 
GCGGACTGTGGCCGGCCGTCGGTGCGCATCGACAATCGCGCCGACGGCGCCCTTTCGGCCGATGGCCGCGTGATG 
1629 1639 1649 1659 1669 1679 1689 1699 
GTYLHGLFTSDAYRGALLKSFGIEG 
GGCACCTACCTGCATGGGCTCTTCACCAGCGACGCCTATCGCGGCGCGCTGCTCAAGAGTTTCGGCATCGAAGGC 
1704 1714 1724 1734 1744 1754 1764 1774 
GANNYRQSVDAALDDVANELEAVLD 
GGCGCCAACAACTACCGCCAATCGGTCGATGCGGCGCTCGACGATGTCGCGAACGAACTGGAGGCTGTGCTCGAT 
1779 1789 1799 1809 1819 1829 1839 1849 
RRWLDELLRH* (SEQ ID NO:43) 
CGTCGCTGGCTGGACGAGTTGCTCAGGCACTAG (SEQ ID NO: 42) 
1854 1864 1874 1884 



FIG. 47B 
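Each row of the gene listings pairs 75 bases of coding strand with the 25 residues they encode, so the protein line can be recomputed from the DNA line. The sketch below assumes the standard genetic code and translates the final row of the cobQ listing; the names `CODON_TABLE` and `translate` are illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated using the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna: str) -> str:
    """Translate a coding-strand DNA string codon by codon ('*' marks a stop codon)."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

# Final row of the cobQ listing (positions 1854 to 1886).
last_row = "CGTCGCTGGCTGGACGAGTTGCTCAGGCACTAG"
print(translate(last_row))  # RRWLDELLRH*
```

Running the same function over any other row gives a quick consistency check between a printed protein line and the DNA line beneath it.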
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COBQ PROTEIN 

FIRST RESIDUE = 1 
LAST RESIDUE = 485 

              NUMBER     NO.%      WEIGHT  WEIGHT % 
 1  PHE  F        11     2.27     1617.75      3.11 
 2  LEU  L        50    10.31     5654.20     10.88 
 3  ILE  I        23     4.74     2600.93      5.01 
 4  MET  M        10     2.06     1310.41      2.52 
 5  VAL  V        50    10.31     4953.42      9.53 
 6  SER  S        24     4.95     2088.77      4.02 
 7  PRO  P        23     4.74     2232.21      4.30 
 8  THR  T        15     3.09     1515.72      2.92 
 9  ALA  A        49    10.10     3480.82      6.70 
10  TYR  Y         8     1.65     1304.51      2.51 
11  *    *         0     0.00        0.00      0.00 
12  HIS  H         7     1.44      959.41      1.85 
13  GLN  Q        15     3.09     1920.88      3.70 
14  ASN  N        16     3.30     1824.69      3.51 
15  LYS  K        15     3.09     1921.42      3.70 
16  ASP  D        34     7.01     3910.92      7.53 
17  GLU  E        28     5.77     3613.19      6.96 
18  CYS  C         4     0.82      412.04      0.79 
19  TRP  W         6     1.24     1116.48      2.15 
20  ARG  R        40     8.25     6244.04     12.02 
21  GLY  G        57    11.75     3250.22      6.26 
22                 0     0.00        0.00      0.00 

RESIDUES                         = 485 
MOLECULAR WEIGHT (MONOISOTOPIC)  = 51950.1016 
MOLECULAR WEIGHT (AVERAGE)       = 51982.3711 
INDEX OF POLARITY (%)            = 40.00 
ISOELECTRIC POINT                = 6.16 

OD 260 (1 mg/ml) = 0.558 OD 280 (1 mg/ml) = 0.825 

COBQ FROM 1 TO 485 
2.10 
-2.10 
48 96 144 192 240 288 336 384 432 480 

FIG. 47C 
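The NUMBER and NO.% columns of the composition table are simple residue counts and their share of the chain length. A minimal sketch of that calculation is given below; the function name `composition` is illustrative, and the short test peptide is the last row of the cobQ listing rather than the full protein.

```python
from collections import Counter

def composition(protein: str) -> dict:
    """Map each one-letter residue code to (count, percent of all residues).

    Spaces are ignored and a trailing '*' (stop codon) is not counted as a residue.
    """
    counts = Counter(protein.replace(" ", "").rstrip("*"))
    total = sum(counts.values())
    return {aa: (n, round(100 * n / total, 2)) for aa, n in counts.items()}

# Short worked example on the final ten residues of CobQ.
print(composition("RRWLDELLRH*"))

# The same arithmetic reproduces a table entry: 11 Phe out of 485 residues.
print(round(100 * 11 / 485, 2))  # 2.27
```

The WEIGHT columns would additionally need a table of residue masses, which is not given in the figure and is therefore left out of this sketch.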
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13144-BP SEQUENCE FROM 3364 TO 3888 cobP GENE 



MSSLSAGPVLVLGGARSGKSSFSER 
ATGAGCAGTCTCAGCGCCGGGCCCGTGCTGGTCCTTGGCGGCGCCCGTTCCGGCAAGTCCAGCTTTTCCGAGAGG 
3364 3374 3384 3394 3404 3414 3424 3434 
LVEASGFTMHYVATGRAWDDEMRER 

CTCGTCGAAGCGTCCGGCTTCACCATGCATTATGTCGCCACGGGCCGCGCCTGGGACGACGAAATGCGCGAGCGC 
3439 3449 3459 3469 3479 3489 3499 3509 
IDHHRTRRGEGWTTHEEPLDLVGIL 
ATCGACCATCACCGGACGCGCCGCGGCGAGGGCTGGACGACGCATGAGGAGCCGCTCGATCTCGTCGGCATCCTC 
3514 3524 3534 3544 3554 3564 3574 3584 
RRIDDPSHVVLIDCLTLWVTNLMLE 
AGACGCATCGATGATCCCAGCCATGTGGTCCTGATCGACTGCCTGACGCTATGGGTCACCAATCTCATGCTGGAA 
3589 3599 3609 3619 3629 3639 3649 3659 
ERDMTAEFAALVAYLPEARARLVFV 
GAGCGCGACATGACGGCGGAGTTCGCCGCCCTTGTTGCGTATCTGCCCGAGGCGCGGGCGCGCCTCGTCTTTGTT 
3664 3674 3684 3694 3704 3714 3724 3734 
SNEVGLGIVPENRMAREFRDHAGRL 
TCCAATGAGGTCGGCCTCGGCATCGTGCCCGAGAACCGCATGGCCCGCGAGTTTCGCGACCATGCCGGCCGGCTT 
3739 3749 3759 3769 3779 3789 3799 3809 

HQIVAEKSAEVYFVAAGLPLKMKG* (SEQ ID NO: 45) 
CACCAGATCGTTGCGGAGAAATCCGCTGAAGTTTACTTTGTCGCGGCCGGTTTGCCGCTGAAAATGAAGGGTTGA (SEQ ID NO: 44) 
3814 3824 3834 3844 3854 3864 3874 3884 



FIG. 47D 
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COBP PROTEIN 

FIRST RESIDUE = 1 
LAST RESIDUE = 174 

              NUMBER     NO.%      WEIGHT  WEIGHT % 
 1  PHE  F         6     3.45      882.41      4.54 
 2  LEU  L        19    10.92     2148.60     11.06 
 3  ILE  I         6     3.45      678.50      3.49 
 4  MET  M         7     4.02      917.28      4.72 
 5  VAL  V        16     9.20     1585.09      8.16 
 6  SER  S        11     6.32      957.35      4.93 
 7  PRO  P         6     3.45      582.32      3.00 
 8  THR  T         8     4.60      808.38      4.16 
 9  ALA  A        17     9.77     1207.63      6.22 
10  TYR  Y         3     1.72      489.19      2.52 
11  *    *         0     0.00        0.00      0.00 
12  HIS  H         7     4.02      959.41      4.94 
13  GLN  Q         1     0.57      128.06      0.66 
14  ASN  N         3     1.72      342.13      1.76 
15  LYS  K         4     2.30      512.38      2.64 
16  ASP  D         9     5.17     1035.24      5.33 
17  GLU  E        16     9.20     2064.68     10.63 
18  CYS  C         1     0.57      103.01      0.53 
19  TRP  W         3     1.72      558.24      2.87 
20  ARG  R        17     9.77     2653.72     13.66 
21  GLY  G        14     8.05      798.30      4.11 
22                 0     0.00        0.00      0.00 

RESIDUES                         = 174 
MOLECULAR WEIGHT (MONOISOTOPIC)  = 19429.9473 
MOLECULAR WEIGHT (AVERAGE)       = 19442.2637 
INDEX OF POLARITY (%)            = 43.68 
ISOELECTRIC POINT                = 6.71 

OD 260 (1 mg/ml) = 0.720 OD 280 (1 mg/ml) = 1.042 

COBP FROM 1 TO 174 
2.40 
-2.40 
17 34 51 68 85 102 119 136 153 170 
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13144-BP SEQUENCE FROM 3892 TO 4956 cobW GENE 



MTTARANQGKIPATVITGFLGAGKT 

ATGACCACTGCGAGAGCCAACCAGGGCAAGATCCCGGCGACCGTCATCACCGGCTTCCTCGGCGCCGGCAAGACG 
3892 3902 3912 3922 3932 3942 3952 3962 

TMIRNLLQNADGKRIGLIINEFGDL 

ACGATGATCCGCAACCTGCTGCAGAACGCCGACGGCAAGCGCATCGGCCTGATCATCAACGAGTTCGGCGATCTT 
3967 3977 3987 3997 4007 4017 4027 4037 

GVDGDVLKGCGAEACTEDDIIELTN 
GGCGTCGACGGCGATGTCTTGAAGGGCTGCGGTGCCGAGGCCTGCACCGAGGACGACATCATCGAGCTCACCAAT 
4042 4052 4062 4072 4082 4092 4102 4112 

GCICCTVADDFIPTMTKLLERENRP 

GGCTGCATCTGCTGCACCGTGGCTGACGATTTCATCCCGACCATGACGAAGCTGCTCGAGCGTGAAAACCGTCCT 
4117 4127 4137 4147 4157 4167 4177 4187 

DHIIIETSGLALPQPLIAAFNWPDI 
GACCACATCATCATCGAAACCTCGGGCCTTGCCCTGCCGCAGCCGCTGATCGCCGCTTTCAACTGGCCGGATATC 
4192 4202 4212 4222 4232 4242 4252 4262 

RSEVTVDGVVTVVDSAAVAAGRFAD 

CGCAGCGAAGTGACCGTCGATGGCGTCGTCACCGTGGTCGACAGCGCCGCCGTTGCCGCTGGCCGCTTTGCCGAC 
4267 4277 4287 4297 4307 4317 4327 4337 

DHDKVDALRVEDDNLDHESPIEELF 
GACCACGACAAGGTCGATGCGCTGCGCGTCGAGGACGACAATCTCGATCACGAAAGCCCGATCGAGGAGCTGTTC 
4342 4352 4362 4372 4382 4392 4402 4412 

EDQLTAADLIVLNKTDLIDASGLKA 
GAGGATCAACTGACGGCTGCCGATCTCATCGTTCTCAACAAGACCGATCTGATCGATGCCTCCGGCCTCAAGGCC 
4417 4427 4437 4447 4457 4467 4477 4487 

VRDEVSSRTSRKPTMIEAKNGEVAA 
GTGCGCGACGAGGTGTCTTCGCGCACCAGCCGCAAGCCCACGATGATCGAGGCGAAAAACGGCGAAGTCGCCGCT 
4492 4502 4512 4522 4532 4542 4552 4562 

AILLGLGVGTESDIANRKSHHEMEH 
GCCATCCTGCTTGGCCTCGGTGTCGGCACGGAAAGCGATATCGCCAACCGCAAGTCGCATCACGAGATGGAGCAC 

4567 4577 4587 4597 4607 4617 4627 4637 
EAGEEHDHDEFDSFVVELGSIADPA 
GAGGCAGGTGAGGAGCACGATCACGACGAGTTCGACAGCTTCGTCGTCGAGCTCGGTTCGATCGCCGATCCGGCC 

4642 4652 4662 4672 4682 4692 4702 4712 
AFIDRLKGVIAEHDVLRLKGFADVP 
GCCTTCATCGATCGCCTGAAGGGCGTAATCGCGGAGCACGACGTTCTGCGCCTCAAGGGTTTTGCAGACGTGCCC 

4717 4727 4737 4747 4757 4767 4777 4787 
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GKPMRLLIQAVGARIDQYYDRAWGA 
GGCAAGCCGATGCGCCTCCTGATCCAGGCGGTCGGCGCCCGCATCGACCAATATTACGACCGCGCCTGGGGCGCT 
4792 4802 4812 4822 4832 4842 4852 4862 

GEKRGTRLVVIGLHDMDEAAVRAAI 
GGCGAAAAGCGCGGTACGCGCCTCGTCGTCATCGGCCTGCACGACATGGACGAGGCGGCGGTGCGCGCCGCGATC 
4867 4877 4887 4897 4907 4917 4927 4937 
TALV* (SEQ. ID. NO: 47) 
ACCGCGCTCGTGTAG (SEQ. ID. NO: 46) 
4942 4952 
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COBW PROTEIN 

FIRST RESIDUE = 1 
LAST RESIDUE = 354 

              NUMBER     NO.%      WEIGHT  WEIGHT % 
 1  PHE  F        10     2.82     1470.68      3.86 
 2  LEU  L        32     9.04     3618.69      9.50 
 3  ILE  I        28     7.91     3166.35      8.31 
 4  MET  M         7     1.98      917.28      2.41 
 5  VAL  V        28     7.91     2773.92      7.28 
 6  SER  S        12     3.39     1044.38      2.74 
 7  PRO  P        11     3.11     1067.58      2.80 
 8  THR  T        21     5.93     2122.00      5.57 
 9  ALA  A        41    11.58     2912.52      7.64 
10  TYR  Y         2     0.56      326.13      0.86 
11  *    *         0     0.00        0.00      0.00 
12  HIS  H        10     2.82     1370.59      3.60 
13  GLN  Q         6     1.69      768.35      2.02 
14  ASN  N        11     3.11     1254.47      3.29 
15  LYS  K        15     4.24     1921.42      5.04 
16  ASP  D        36    10.17     4140.97     10.87 
17  GLU  E        27     7.63     3484.15      9.15 
18  CYS  C         5     1.41      515.05      1.35 
19  TRP  W         2     0.56      372.16      0.98 
20  ARG  R        20     5.65     3122.02      8.19 
21  GLY  G        30     8.47     1710.64      4.49 
22                 0     0.00        0.00      0.00 

RESIDUES                         = 354 
MOLECULAR WEIGHT (MONOISOTOPIC)  = 38097.4258 
MOLECULAR WEIGHT (AVERAGE)       = 38121.1055 
INDEX OF POLARITY (%)            = 44.63 
ISOELECTRIC POINT                = 4.90 

OD 260 (1 mg/ml) = 0.268 OD 280 (1 mg/ml) = 0.354 

COBW FROM 1 TO 354 
2.40 
-2.40 
35 70 105 140 175 210 245 280 315 350 
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13144-BP SEQUENCE FROM 5060 TO 8887 cobN GENE 

MHLLLAQKGTIADGNEAIDLGQTPA 
ATGCATCTGCTTCTCGCCCAGAAAGGAACGATCGCCGACGGCAACGAGGCGATCGACCTTGGGCAAACGCCGGCC 
5060 5070 5080 5090 5100 5110 5120 5130 
DILFLSAADTELSSIAAAHGRRDGG 
GATATCCTTTTCCTATCGGCCGCCGACACCGAGCTCTCCTCGATCGCCGCGGCTCACGGCCGACGCGACGGAGGC 
5135 5145 5155 5165 5175 5185 5195 5205 
LSLRIASLMSLMHPMSVDTYVERTA 
TTGAGCCTGCGCATCGCCAGCCTGATGAGCCTGATGCACCCGATGTCGGTCGACACTTACGTCGAGCGCACGGCG 
5210 5220 5230 5240 5250 5260 5270 5280 
RHAKLIVVRPLGGASYFRYLLEALH 
CGTCACGCCAAGCTGATCGTCGTCCGGCCGCTCGGTGGCGCCAGCTATTTCCGTTATCTGCTGGAAGCCCTGCAT 

5285 5295 5305 5315 5325 5335 5345 5355 
AAAVTHRFEIAVLPGDDKPDPGLEP 

GCGGCTGCCGTCACCCATCGTTTCGAGATCGCGGTTCTGCCGGGTGACGACAAGCCGGATCCGGGGCTGGAGCCT 
5360 5370 5380 5390 5400 5410 5420 5430 
FSTVAADDRQRLWAYFTEGGSDNAG 
TTCTCCACCGTCGCAGCCGACGACCGCCAGCGCCTTTGGGCTTACTTCACCGAAGGCGGCTCGGACAATGCCGGG 
5435 5445 5455 5465 5475 5485 5495 5505 
LFLDYAAALVTGAEKPQPAKPLLKA 
CTGTTTCTCGACTATGCGGCCGCACTGGTCACAGGTGCGGAGAAGCCGCAGCCGGCAAAGCCCCTGTTGAAGGCC 
5510 5520 5530 5540 5550 5560 5570 5580 
GIWWPGAGVIGVSEWQSLVQGRMVA 
GGCATCTGGTGGCCGGGTGCTGGTGTGATCGGCGTCAGCGAATGGCAGTCCCTTGTTCAGGGACGGATGGTAGCG 
5585 5595 5605 5615 5625 5635 5645 5655 
REGFEPPTVGICFYRALVQSGETRP 
AGGGAGGGATTCGAACCCCCGACGGTCGGGATCTGCTTTTACCGCGCGCTCGTGCAGAGTGGCGAGACACGGCCT 
5660 5670 5680 5690 5700 5710 5720 5730 
VEALIDALEAEGVRALPVFVSSLKD 
GTGGAGGCGCTGATCGATGCGCTGGAGGCTGAAGGTGTGCGGGCACTGCCGGTGTTTGTCTCAAGCCTCAAGGAT 
5735 5745 5755 5765 5775 5785 5795 5805 
AVSVGTLQAIFSEAAPDVVMNATGF 
GCCGTTTCCGTCGGCACGCTGCAGGCGATTTTTTCCGAGGCCGCACCCGACGTGGTGATGAACGCCACTGGCTTT 

5810 5820 5830 5840 5850 5860 5870 5880 
AVSSPGADRQPTVLESTGAPVLQVI 

GCGGTCTCGTCGCCCGGTGCCGACCGTCAGCCGACGGTGCTGGAATCGACCGGTGCGCCGGTGCTGCAGGTGATT 

5885 5895 5905 5915 5925 5935 5945 5955 
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FSGSSRAQWETSPQGLMARDLAMNV 

TTCTCCGGCTCGTCGCGGGCGCAATGGGAAACGTCGCCGCAGGGGCTGATGGCGCGCGACCTCGCCATGAACGTG 
5960 5970 5980 5990 6000 6010 6020 6030 

ALPEVDGRILARAVSFKAASIYDAK 
GCACTCCCCGAAGTCGATGGCCGCATCCTTGCGCGCGCCGTCTCCTTCAAGGCGGCGTCGATCTATGACGCCAAG 
6035 6045 6055 6065 6075 6085 6095 6105 
VEANIVGHEPLEGRVRFAADLAVNW 
GTGGAGGCCAATATCGTCGGCCATGAGCCGCTCGAAGGCCGGGTGCGCTTTGCCGCTGATCTTGCCGTCAACTGG 
6110 6120 6130 6140 6150 6160 6170 6180 
ANVRRAEPAERRIAIVMANYPNRDG 
GCGAACGTGCGCCGGGCAGAGCCCGCCGAGCGCCGTATTGCCATCGTCATGGCCAACTATCCGAACCGCGACGGT 
6185 6195 6205 6215 6225 6235 6245 6255 
RLGNGVGLDTPAGTVEVLSAMAREG 
CGCCTCGGCAACGGTGTCGGGCTCGACACGCCGGCCGGTACCGTCGAGGTGCTTAGCGCCATGGCGCGGGAAGGC 
6260 6270 6280 6290 6300 6310 6320 6330 
YAVGEVPADGDALIRFLMAGPTNAA 
TATGCGGTCGGTGAGGTTCCCGCCGATGGCGACGCGCTGATCCGCTTTCTGATGGCCGGGCCGACCAATGCGGCG 
6335 6345 6355 6365 6375 6385 6395 6405 
SHDREIRERISLNDYKTFFDSLPKQ 
AGCCATGACCGTGAAATCCGCGAGCGTATTTCGCTGAACGATTACAAAACGTTCTTCGATTCGCTTCCGAAACAG 
6410 6420 6430 6440 6450 6460 6470 6480 

IKDEVAGRWGVPEADPFFLDGAFAL 
ATAAAGGATGAAGTTGCCGGTCGCTGGGGCGTGCCGGAGGCCGATCCCTTTTTCCTCGATGGCGCCTTCGCGCTG 

6485 6495 6505 6515 6525 6535 6545 6555 

PLARFGEVIVGIQPARGYNIDPKES 
CCGCTCGCCCGCTTCGGCGAGGTGATCGTCGGCATCCAACCGGCGCGCGGCTACAACATCGATCCGAAGGAAAGC 

6560 6570 6580 6590 6600 6610 6620 6630 
YHSPDLVPPHGYLAFYAFLRQQFGA 
TACCATTCCCCGGACCTCGTGCCGCCGCATGGCTATCTCGCCTTCTACGCCTTCCTGCGCCAGCAGTTCGGAGCG 
6635 6645 6655 6665 6675 6685 6695 6705 
QAIVHMGKHGNLEWLPGKALALSET 
CAGGCGATCGTCCACATGGGCAAGCACGGCAATCTCGAATGGCTGCCGGGCAAGGCGCTGGCGCTGTCGGAAACC 
6710 6720 6730 6740 6750 6760 6770 6780 
CYPEAIFGPLPHIYPFIVNDPGEGT 
TGCTATCCCGAAGCGATCTTCGGGCCGCTGCCGCACATCTATCCCTTCATCGTCAACGATCCGGGCGAAGGTACG 
6785 6795 6805 6815 6825 6835 6845 6855 

FIG. 47J 
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QAKRRTSAVIIDHLTPPLTRAESYG 
CAGGCCAAGCGCCGCACCAGCGCCGTCATCATCGACCACCTGACCCCGCCCTTGACGCGCGCCGAATCCTACGGC 
6860 6870 6880 6890 6900 6910 6920 6930 

PLKDLEALVDEYYDAAGGDPRRLRL 
CCGCTCAAGGATCTGGAAGCGCTCGTCGACGAATATTACGACGCCGCCGGCGGTGATCCGCGCCGCCTCAGGCTG 
6935 6945 6955 6965 6975 6985 6995 7005 

LSRQILDLVRDIGLDSDAGIDRGDS 
CTCAGCCGCCAGATCCTCGATCTCGTGCGCGACATCGGCCTCGACAGCGACGCAGGCATCGACAGGGGCGACAGC 
7010 7020 7030 7040 7050 7060 7070 7080 

DDKALEKLDAYLCDLKEMQIRDGLH 
GACGACAAGGCGCTGGAAAAGCTCGACGCCTATCTCTGCGACCTCAAGGAAATGCAGATCCGCGACGGCCTGCAC 
7085 7095 7105 7115 7125 7135 7145 7155 

IFGVAPEGRLLTDLTVALARVPRGL 
ATCTTCGGCGTTGCGCCGGAAGGGCGGTTGTTGACGGACCTCACCGTAGCGCTGGCGCGCGTGCCCCGAGGTCTC 
7160 7170 7180 7190 7200 7210 7220 7230 

GEGGDQSLQRAIAADAGLRGFAIPT 
GGCGAGGGCGGCGACCAGAGCCTGCAGCGGGCGATCGCAGCGGATGCGGGGCTGCGTGGGTTTGCTATTCCCACC 
7235 7245 7255 7265 7275 7285 7295 7305 
SAGGNPARDAQPFDPLDCVMSDTWT 
TCGGCGGGGGGCAACCCCGCACGCGACGCCCAACCCTTCGACCCGCTCGACTGCGTCATGTCCGACACCTGGACA 
7310 7320 7330 7340 7350 7360 7370 7380 

GPKPSILADLSDAPWRTAGDTVERI 
GGCCCGAAACCGTCCATCCTCGCTGACCTCTCGGACGCCCCCTGGCGCACCGCCGGCGATACGGTCGAGCGCATC 
7385 7395 7405 7415 7425 7435 7445 7455 

ELLAANLVSGELACPDHWANTRAVL 
GAGTTGCTTGCCGCAAATCTCGTGTCGGGTGAACTGGCTTGCCCGGACCACTGGGCCAACACCCGCGCCGTGCTC 
7460 7470 7480 7490 7500 7510 7520 7530 

GEIETRLKPSISNSGAAEMTGFLTG 
GGCGAAATCGAAACGCGCCTGAAGCCGTCGATTTCAAACTCGGGTGCCGCCGAGATGACCGGCTTCCTCACCGGT 
7535 7545 7555 7565 7575 7585 7595 7605 

LSGRFVAPGPSGAPTRGRPDVLPTG 
CTCAGCGGCCGCTTCGTCGCCCCCGGTCCATCGGGCGCGCCGACGCGCGGCCGGCCGGATGTGTTGCCGACGGGG 
7610 7620 7630 7640 7650 7660 7670 7680 

RNFYSVDSRAVPTPAAYELGKKSAE 
CGCAATTTCTACTCGGTCGACAGCCGCGCCGTGCCGACGCCGGCGGCTTACGAGCTTGGCAAGAAATCGGCCGAG 
7685 7695 7705 7715 7725 7735 7745 7755 
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LLIRRYLQDHGEWPSSFGLTAWGTA 
CTTCTGATCCGCCGCTACCTGCAGGACCATGGCGAATGGCCGTCCTCCTTTGGCCTGACCGCCTGGGGCACGGCG 
7760 7770 7780 7790 7800 7810 7820 7830 

NMRTGGDDIAQALALIGAKPTWDMV 
AACATGCGCACCGGCGGCGACGACATCGCCCAGGCCCTGGCGCTGATCGGCGCCAAGCCCACCTGGGACATGGTC 
7835 7845 7855 7865 7875 7885 7895 7905 
SRRVMGYEIVPLAVLGRPRVDVTLR 
TCTCGCCGGGTGATGGGCTACGAGATCGTGCCGCTCGCAGTCCTCGGCCGCCCACGCGTCGACGTGACCTTGCGC 
7910 7920 7930 7940 7950 7960 7970 7980 
ISGFFRDAFPDQIALFDKAIRAVAL 
ATTTCCGGCTTCTTCCGCGATGCCTTCCCGGACCAGATCGCGCTCTTCGACAAGGCGATCCGCGCCGTCGCGCTG 
7985 7995 8005 8015 8025 8035 8045 8055 
EEDDADNMIAARMRAESRRLEAEGV 
GAGGAAGACGATGCCGACAACATGATCGCCGCACGCATGCGGGCGGAAAGCCGGCGGCTGGAGGCCGAAGGCGTG 
8060 8070 8080 8090 8100 8110 8120 8130 
EAAEAARRASYRVFGAKPGAYGAAL 
GAAGCCGCCGAGGCCGCGCGTCGCGCCTCCTACCGCGTCTTTGGCGCAAAGCCCGGTGCCTATGGCGCCGCCCTG 
8135 8145 8155 8165 8175 8185 8195 8205 
QALIDEKGWETKADLAEAYLTWGAY 
CAGGCGCTGATCGACGAGAAGGGCTGGGAAACCAAAGCAGATCTCGCCGAGGCCTATCTTACCTGGGGCGCCTAT 
8210 8220 8230 8240 8250 8260 8270 8280 
AYGAGEEGKAERDLFEERLRTIEAV 
GCCTATGGCGCCGGCGAGGAGGGCAAGGCCGAGCGCGATCTTTTCGAGGAGCGCCTGCGCACGATAGAGGCGGTG 
8285 8295 8305 8315 8325 8335 8345 8355 

VQNQDNREHDLLDSDDYYQFEGGMS 
GTGCAGAACCAGGACAACCGCGAGCACGATCTGCTCGACAGCGACGACTACTACCAGTTCGAAGGCGGCATGAGC 
8360 8370 8380 8390 8400 8410 8420 8430 
AAAEQLGGHRPAIYHNDHSRPEKPV 
GCTGCCGCCGAACAGCTCGGCGGTCACCGTCCGGCGATCTACCACAACGACCATTCCCGTCCGGAAAAGCCTGTG 
8435 8445 8455 8465 8475 8485 8495 8505 
IRSLEEEIGRVVRARVVNPKWIDGV 
ATCCGGTCGCTCGAAGAAGAGATCGGCCGCGTGGTCCGGGCCCGCGTCGTCAATCCCAAGTGGATCGATGGCGTC 
8510 8520 8530 8540 8550 8560 8570 8580 
MRHGYKGAFEIAATVDYMFAFAATT 
ATGCGCCACGGATACAAGGGCGCCTTCGAGATCGCTGCCACGGTCGACTACATGTTCGCCTTTGCCGCGACCACG 
8585 8595 8605 8615 8625 8635 8645 8655 
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GAVRDHHFEAAYQAFIVDERVADFM 
GGTGC(^TGCGCGACCATCATTTCGAGGCCGCTTATCAGGCGTTCATTGTCGACGAGCGCGTG^^^ 

8660 8670 8680 8690 8700 8710 8720 8730 
RDKNPAAFAELAERLLEAIDRNLWT 
CGCGACAAGAACCCGGCCGCCTTTGCCGAGCTTGCCGAACGCCTGCTTGAAGCAATCGACCGCAATCTCTGGACG 

8735 8745 8755 8765 8775 8785 8795 8805 

PRSNSARFELAGIGTAATRLRAGNE 
CCGCGCTCGAATTCGGCGCGGTTTGAACTTGCCGGCATCGGCACGGCAGCAACCCGGCTTCGTGCCGGCAATGAA 

8810 8820 8830 8840 8850 8860 8870 8880 

* (SEQ ID NO: 49) 

TAG (SEQ ID NO: 48) 

8885 
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COBN PROTEIN 

FIRST RESIDUE = 1 
LAST RESIDUE = 1275 

              NUMBER     NO.%      WEIGHT  WEIGHT % 
 1  PHE  F        48     3.76     7059.28      5.12 
 2  LEU  L       121     9.49    13683.17      9.92 
 3  ILE  I        60     4.71     6785.04      4.92 
 4  MET  M        24     1.88     3144.97      2.28 
 5  VAL  V        82     6.43     8123.61      5.89 
 6  SER  S        64     5.02     5570.05      4.04 
 7  PRO  P        76     5.96     7376.01      5.35 
 8  THR  T        53     4.16     5355.53      3.88 
 9  ALA  A       180    14.12    12786.68      9.27 
10  TYR  Y        35     2.75     5707.22      4.14 
11  *    *         0     0.00        0.00      0.00 
12  HIS  H        24     1.88     3289.41      2.38 
13  GLN  Q        32     2.51     4097.87      2.97 
14  ASN  N        30     2.35     3421.29      2.48 
15  LYS  K        34     2.67     4355.23      3.16 
16  ASP  D        90     7.06    10352.42      7.50 
17  GLU  E        85     6.67    10968.62      7.95 
18  CYS  C         5     0.39      515.05      0.37 
19  TRP  W        18     1.41     3349.43      2.43 
20  ARG  R        99     7.76    15454.01     11.20 
21  GLY  G       115     9.02     6557.47      4.75 
22                 0     0.00        0.00      0.00 

RESIDUES                         = 1275 
MOLECULAR WEIGHT (MONOISOTOPIC)  = 137970.5000 
MOLECULAR WEIGHT (AVERAGE)       = 138055.8594 
INDEX OF POLARITY (%)            = 40.08 
ISOELECTRIC POINT                = 5.42 

OD 260 (1 mg/ml) = 0.693 OD 280 (1 mg/ml) = 1.027 

COBN FROM 1 TO 1275 
-2.40 
127 254 381 508 635 762 889 1016 1143 1270 
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13144-BP SEQUENCE FROM 9034 TO 9678 cobO GENE 



MSDETTVGGEAPAEKDDARHAMKMA 
ATGAGCGACGAGACGACAGTAGGCGGCGAAGCCCCGGCCGAGAAGGACGATGCCCGCCACGCCATGAAGATGGCG 
9034 9044 9054 9064 9074 9084 9094 9104 

KKKAAREKIMATKTDEKGLIIVNTG 
AAGAAGAAGGCAGCACGCGAAAAGATCATGGCGACGAAGACCGACGAGAAGGGTCTGATCATCGTCAACACCGGC 
9109 9119 9129 9139 9149 9159 9169 9179 

KGKGKSTAGFGMIFRHIAHGMPCAV 
AAAGGCAAGGGCAAGTCGACCGCCGGCTTCGGCATGATCTTCCGCCATATCGCCCACGGCATGCCCTGCGCCGTC 
9184 9194 9204 9214 9224 9234 9244 9254 

VQFIKGAMATGERELIEKHFGDVCQ 
GTGCAGTTCATCAAGGGTGCGATGGCAACCGGCGAGCGCGAGTTGATCGAGAAGCATTTCGGCGATGTCTGCCAG 
9259 9269 9279 9289 9299 9309 9319 9329 

FYTLGEGFTWETQDRARDVAMAEKA 
TTCTACACGCTCGGCGAGGGCTTCACCTGGGAAACGCAGGATCGCGCCCGCGATGTTGCGATGGCTGAAAAGGCC 
9334 9344 9354 9364 9374 9384 9394 9404 

WEKAKELIRDERNSMVLLDEINIAL 
TGGGAGAAGGCGAAGGAACTGATCCGTGACGAGCGCAACTCGATGGTGCTGCTCGACGAGATCAACATTGCTCTG 
9409 9419 9429 9439 9449 9459 9469 9479 

RYDYIDVAEVVRFLKEEKPHMTHVV 
CGCTACGACTACATCGACGTCGCCGAAGTGGTGCGCTTCCTGAAGGAAGAAAAGCCGCACATGACGCATGTGGTG 
9484 9494 9504 9514 9524 9534 9544 9554 

LTGRNAKEDLIEVADLVTEMELIKH 
CTCACCGGCCGCAACGCGAAAGAAGACCTGATCGAAGTCGCCGATCTCGTCACTGAGATGGAGCTGATCAAGCAT 
9559 9569 9579 9589 9599 9609 9619 9629 
PFRSGIKAQQGVEF* (SEQ ID NO: 51) 
CCGTTCCGTTCCGGCATCAAGGCGCAGCAGGGCGTGGAGTTCTGA (SEQ ID NO: 50) 
9634 9644 9654 9664 9674 
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NH2-TERMINAL SEQUENCE OF SUMT OF M. ivanovii 

VVYLVGAGPGDPELITLKAVNVLK-ADVVL (AMINO ACID FRAGMENT 2-31 OF SEQ ID NO: 54) 

923 946 

SENSE OLIGONUCLEOTIDE 946 (27-mer) 

P G D P E L (AMINO ACIDS 10-15 OF SEQ ID NO: 54) 

5' CGCGGAATTC CCA GGA GAT CCA GAA CT 3' (SEQ ID NO: 56) 
EcoRI T T C T G 
C C C 
G G G 

SENSE OLIGONUCLEOTIDE 923 (27-mer) 

V Y L V G A (AMINO ACIDS 3-8 OF SEQ ID NO: 54) 
5' CGCGGAATTC GTA TAT CTA GTA GGA GC 3' (SEQ ID NO: 57) 
EcoRI G C T T T 

G 

NH2-TERMINAL SEQUENCE OF A FRAGMENT DERIVED FROM A 
TRYPTIC DIGESTION OF M. ivanovii SUMT 

IITGTLENIAGK (AMINO ACIDS 201-212 OF 

947 SEQ ID NO: 54) 

ANTISENSE OLIGONUCLEOTIDE 947 (25-mer) 

N E L T G (CO2->NH2 ORIENTATION OF AMINO ACIDS 207-203 OF SEQ ID NO: 54) 

5' CGCG AAGCTT GTT TTC TAG AGT ACC 3' (SEQ ID NO: 58) 

HindIII A C A T T 

G C 



FIG. 48A 
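Antisense oligonucleotide 947 is written 5' to 3' on the strand complementary to the coding strand, which is why the residues above it read in CO2 to NH2 order. A minimal sketch follows, assuming the standard genetic code and using only the first-listed base at each degenerate position; the helper names are illustrative. It reverse-complements the 15 coding bases of the oligonucleotide (the part after the CGCG AAGCTT tail) and translates the result.

```python
from itertools import product

# Standard genetic code, with codons enumerated using the base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the complementary strand read in its own 5' to 3' direction."""
    return dna.replace(" ", "").translate(COMPLEMENT)[::-1]

def translate(dna: str) -> str:
    """Translate a coding-strand DNA string codon by codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3))

# Gene-specific part of antisense oligonucleotide 947, as printed 5' to 3'.
antisense = "GTT TTC TAG AGT ACC"
coding = reverse_complement(antisense)
print(coding)             # GGTACTCTAGAAAAC
print(translate(coding))  # GTLEN
```

Read in the NH2 to CO2 direction the coding strand gives G T L E N, which is the N E L T G printed above the oligonucleotide in reverse, and matches the GTLEN stretch of the tryptic peptide IITGTLENIAGK.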



CODING STRAND 

947 
946 

STRAND COMPLEMENTARY TO THE CODING STRAND 

AFTER ONE CYCLE OF ENZYMATIC AMPLIFICATION 

M. ivanovii SUMT STRUCTURAL GENE (SEQ ID NO: 54) 

FIG. 48B 
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615-bp FRAGMENT AFTER DIGESTION WITH EcoRI AND HindIII 

M13mp19 DIGESTED WITH HindIII AND EcoRI 

LIGATION AND TRANSFORMATION 

SYNTHESISED DURING THE SEQUENCING REACTION 

HYBRIDISATION SITE OF THE PRIMER -20 OF THE ss DNA OF PHAGE M13mp19 

SEQUENCE COMPLEMENTARY TO THE SENSE OLIGONUCLEOTIDE 946 
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10 20 30 40 50 60 

CCATAATTCT TTTATAATTT AAACGGTGAA CACATGGTAG TTTATTTAGT AGGTGCGGGT 
GGTATTAAGA AAATATTAAA TTTGCCACTT GTGTACCATC AAATAAATCA TCCACGCCCA 

70 80 90 100 110 120 

CCAGGAGATC CCGAACTTAT CACTCTCAAA GCTGTAAACG TGTTAAAAAA AGCGGATGTT 
GGTCCTCTAG GGCTTGAATA GTGAGAGTTT CGACATTTGC ACAATTTTTT TCGCCTACAA 

130 140 150 160 170 180 

GTACTGTACG ACAAACCTGC AAATGAAGAA ATTTTAAAGT ATGCTGAAGG TGCAAAACTA 
CATGACATGC TGTTTGGACG TTTACTTCTT TAAAATTTCA TACGACTTCC ACGTTTTGAT 

190 200 210 220 230 240 

ATATATGTCG GAAAACAAGC AGGACATCAT TACAAATCTC AAAATGAAAT CAATACTCTT 
TATATACAGC CTTTTGTTCG TCCTGTAGTA ATGTTTAGAG TTTTACTTTA GTTATGAGAA 

250 260 270 280 290 300 

CTTGTTGAAG AAGCAAAAGA AAATGATTTA GTAGTACGCC TTAAAGGTGG AGACCCCTTT 
GAACAACTTC TTCGTTTTCT TTTACTAAAT CATCATGCGG AATTTCCACC TCTGGGGAAA 

310 320 330 340 350 360 

GTATTTGGAA GAGGAGGCGA GGAAATTCTG GCCCTTGTAG AAGAAGGAAT TGATTTTGAG 
CATAAACCTT CTCCTCCGCT CCTTTAAGAC CGGGAACATC TTCTTCCTTA ACTAAAACTC 

370 380 390 400 410 420 

TTAGTTCCAG GGGTAACTTC TGCAATTGGA GTTCCAACAA CAATTGGGCT TCCAGTTACT 
AATCAAGGTC CCCATTGAAG ACGTTAACCT CAAGGTTGTT GTTAACCCGA AGGTCAATGA 

430 440 450 460 470 480 

CATAGAGGTG TTGCAACATC GTTTACAGTT GTTACAGGTC ATGAAGACCC AACAAAATGC 
GTATCTCCAC AACGTTGTAG CAAATGTCAA CAATGTCCAG TACTTCTGGG TTGTTTTACG 

490 500 510 520 530 540 

AAGAAACAGG TAGGATGGGA CTTTAAAGCA GATACTATTG TAATACTTAT GGGTATTGGA 
TTCTTTGTCC ATCCTACCCT GAAATTTCGT CTATGATAAC ATTATGAATA CCCATAACCT 

550 560 570 580 590 600 

AATTTAGCTG AAAATACAGC AGAAATTATG AAACATAAAG ATCCTGAAAC TCCAGTTTGT 
TTAAATCGAC TTTTATGTCG TCTTTAATAC TTTGTATTTC TAGGACTTTG AGGTCAAACA 
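
In this listing the lower line of each pair is the base-by-base complement of the upper line, and the first ATG of the upper strand falls at position 34, the start of the 34 to 729 interval. A short sketch of both checks on the first row, with illustrative helper names that are not part of the source:

```python
# Illustrative check on the first row (positions 1-60) of the listing.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Base-by-base complement, without reversing the order."""
    return "".join(COMPLEMENT[base] for base in seq)


upper = ("CCATAATTCT TTTATAATTT AAACGGTGAA "
         "CACATGGTAG TTTATTTAGT AGGTGCGGGT").replace(" ", "")
lower = ("GGTATTAAGA AAATATTAAA TTTGCCACTT "
         "GTGTACCATC AAATAAATCA TCCACGCCCA").replace(" ", "")

strands_pair = complement(upper) == lower
first_atg = upper.find("ATG") + 1   # 1-based position of the first ATG

print(strands_pair, first_atg)
```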
corA GENE AND CORA PROTEIN (SUMT) OF METHANOBACTERIUM
IVANOVII SEQUENCE OF 955-BP FRAGMENT 
FROM 34 TO 729 



MVVYLVGAGPGDPELITLKAVNVLK 
ATGGTAGTTTATTTAGTAGGTGCGGGTCCAGGAGATCCCGAACTTATCACTCTCAAAGCTGTAAACGTGTTAAAA 
34 44 54 64 74 84 94 104 

KADVVLYDKPANEEILKYAEGAKLI 
AAAGCGGATGTTGTACTGTACGACAAACCTGCAAATGAAGAAATTTTAAAGTATGCTGAAGGTGCAAAACTAATA 
109 119 129 139 149 159 169 179

YVGKQAGHHYKSQNEINTLLVEEAK
TATGTCGGAAAACAAGCAGGACATCATTACAAATCTCAAAATGAAATCAATACTCTTCTTGTTGAAGAAGCAAAA 
184 194 204 214 224 234 244 254 

ENDLVVRLKGGDPFVFGRGGEEILA 
GAAAATGATTTAGTAGTACGCCTTAAAGGTGGAGACCCCTTTGTATTTGGAAGAGGAGGCGAGGAAATTCTGGCC 
259 269 279 289 299 309 319 329 

LVEEGIDFELVPGVTSAIGVPTTIG
CTTGTAGAAGAAGGAATTGATTTTGAGTTAGTTCCAGGGGTAACTTCTGCAATTGGAGTTCCAACAACAATTGGG 
334 344 354 364 374 384 394 404 

LPVTHRGVATSFTVVTGHEDPTKCK 
CTTCCAGTTACTCATAGAGGTGTTGCAACATCGTTTACAGTTGTTACAGGTCATGAAGACCCAACAAAATGCAAG 
409 419 429 439 449 459 469 479

KQVGWDFKADTIVILMGIGNLAENT
AAACAGGTAGGATGGGACTTTAAAGCAGATACTATTGTAATACTTATGGGTATTGGAAATTTAGCTGAAAATACA 
484 494 504 514 524 534 544 554 

AEIMKHKDPETPVCVIENGTMEGQR 
GCAGAAATTATGAAACATAAAGATCCTGAAACTCCAGTTTGTGTAATTGAAAATGGTACGATGGAAGGTCAAAGG 
559 569 579 589 599 609 619 629 

IITGTLENIAGKDIKPPALVVLEML 
ATAATAACGGGCACACTGGAAAATATAGCTGGAAAGGATATTAAACCTCCTGCTTTAGTGGTATTGGAAATGTTG 
634 644 654 664 674 684 694 704 

SMFLKK*
TCAATGTTTTTAAAGAAATGA 
709 719 729 
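
The amino acid line above each block of nucleotides is the translation of that block, read in frame from position 34. A minimal sketch, assuming the standard genetic code (the table construction and function name are illustrative, not taken from the source), reproduces the first and last rows and the interval arithmetic:

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate in frame from the first base; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# First row (positions 34-108) and last row (positions 709-729) of the listing
first_row = ("ATGGTAGTTTATTTAGTAGGTGCGGGTCCAGGAGATCCCGAACTT"
             "ATCACTCTCAAAGCTGTAAACGTGTTAAAA")
last_row = "TCAATGTTTTTAAAGAAATGA"

orf_length = 729 - 34 + 1   # nucleotides in the interval 34..729
codons = orf_length // 3    # includes the stop codon

print(translate(first_row))
print(translate(last_row))
print(orf_length, codons)
```

The interval spans 696 nucleotides, that is 232 codons: 231 amino acids followed by the TGA stop codon.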